The present invention relates to language models. In particular, the present invention relates to adapting language models based on user input.
Language models provide a measure of the likelihood of a series of words appearing in a string of text. Such models are used in speech recognition, Chinese word segmentation, and phonetic-to-character conversion, such as pinyin-to-hanzi conversion in Chinese, to identify a most likely sequence of words given a lattice of possible sequences. For example, in speech recognition, a language model would identify the phrase “go to bed” as being more likely than the phonetically similar phrase “go too bed”.
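By way of illustration only, the following minimal sketch shows how a language model might score the two phrases in the example above. It assumes a bigram model with add-one smoothing trained on a small invented corpus. The corpus, the `<s>` start marker, and the function name `log_prob` are hypothetical and are not taken from any particular system.

```python
import math
from collections import Counter

# Toy training corpus, invented purely for illustration.
corpus = [
    "i go to bed early",
    "we go to school",
    "they go to bed late",
    "that is too much",
]

unigrams = Counter()  # counts of each word (including the start marker)
bigrams = Counter()   # counts of each adjacent word pair
for sentence in corpus:
    words = ["<s>"] + sentence.split()  # "<s>" marks the sentence start
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

vocab_size = len(unigrams)


def log_prob(phrase):
    """Return the add-one smoothed bigram log-probability of a phrase."""
    words = ["<s>"] + phrase.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        # Add-one smoothing keeps unseen word pairs from scoring zero.
        numerator = bigrams[(prev, cur)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


# The model prefers the word sequence it has seen more evidence for.
likely = log_prob("go to bed")
unlikely = log_prob("go too bed")
print(likely > unlikely)
```

In this sketch, “go to bed” receives the higher score because the pairs “go to” and “to bed” occur in the toy corpus, whereas “go too” and “too bed” do not. A word pair absent from the training data receives only the small smoothed probability.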
Typically, language models are trained on a corpus of sentences. Although such corpora are effective for training language models to handle general words, they are not very effective for training language models to handle proper nouns such as the names of people and businesses. The reason for this is that proper names do not occur with enough frequency in a corpus to be accurately modeled.
Some systems allow users to correct mistakes made by the language model. However, even after a system knows about the correction, there is no way for the system to adjust the language model based on the correction because there is no way to assess the probability of the word sequence formed by the correction. Because of this, the system will generally make the same mistake later when it encounters the same input.
Thus, a system is needed that allows a language model and a dynamic dictionary to be modified based on corrections made by a user.