An empirical study on language model adaptation

2006; Association for Computing Machinery; Volume: 5; Issue: 3; Language: English

DOI: 10.1145/1194936.1194939

ISSN

1558-3430

Authors

Jianfeng Gao, Hisami Suzuki, Wei Yuan

Topic(s)

Speech Recognition and Synthesis

Abstract

This article presents an empirical study of four techniques for adapting language models, a maximum a posteriori (MAP) method and three discriminative training methods, applied to Japanese Kana-Kanji conversion. We compare the performance of these methods from various angles by adapting the baseline model to four adaptation domains. In particular, we attempt to interpret the results in terms of the character error rate (CER) by correlating them with the characteristics of the adaptation domain, measured using the information-theoretic notion of cross entropy. We show that such a metric correlates well with the CER performance of the adaptation methods. We also show that the discriminative methods are superior to the MAP-based method not only in achieving a larger CER reduction, but also in having fewer side effects and in being more robust to how similar the background and adaptation domains are.
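As a rough illustration of how cross entropy can characterize the distance between a background domain and an adaptation domain, the following minimal sketch scores two toy texts under a model trained on background text. It is not the authors' setup: the add-one-smoothed character unigram model, the function names `train_unigram` and `cross_entropy`, and the toy strings are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter

def train_unigram(text, vocab):
    """Add-one-smoothed character unigram model (an illustrative simplification)."""
    counts = Counter(text)
    total = len(text) + len(vocab)
    return {ch: (counts[ch] + 1) / total for ch in vocab}

def cross_entropy(model, text):
    """Average negative log2 probability per character of `text` under `model`."""
    return -sum(math.log2(model[ch]) for ch in text) / len(text)

# Hypothetical toy data: one background corpus and two adaptation domains.
background = "abababababab"
similar_domain = "abababab"
distant_domain = "ccccabcc"

vocab = set(background + similar_domain + distant_domain)
model = train_unigram(background, vocab)

h_similar = cross_entropy(model, similar_domain)
h_distant = cross_entropy(model, distant_domain)
print(f"similar domain: {h_similar:.3f} bits/char")
print(f"distant domain: {h_distant:.3f} bits/char")
```

Under these assumptions, the text that resembles the background data gets the lower cross entropy, which is the sense in which the measure can stand in for domain similarity.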
