The job of a language model is to provide a speech recognizer with estimates of the probabilities of sequences of words. State-of-the-art language models are known as trigram models. A trigram model predicts the probability of a word from the previous two words. The probability of a sequence of words is given by the product of the probabilities of each word given the two words that precede it. The probability of a word w given the previous two words x y is estimated from a training corpus of text as the number of times that the words x y w occurred in that sequence divided by the number of times that the words x y occurred together.
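The count-based estimate described above can be illustrated with a minimal sketch. The function names (train_counts, trigram_prob, sequence_prob) and the toy corpus in the usage note are hypothetical and chosen only for illustration; a practical system would also handle sentence boundaries and much larger corpora.

```python
from collections import Counter

def train_counts(tokens):
    """Count trigrams (x, y, w) and word pairs (x, y) in a token list."""
    trigram_counts = Counter(zip(tokens, tokens[1:], tokens[2:]))
    pair_counts = Counter(zip(tokens, tokens[1:]))
    return trigram_counts, pair_counts

def trigram_prob(w, x, y, trigram_counts, pair_counts):
    """Estimate P(w | x y) as count(x y w) / count(x y)."""
    pair_total = pair_counts[(x, y)]
    if pair_total == 0:
        # Assumption: return 0.0 when the pair x y was never seen,
        # since the ratio is otherwise undefined.
        return 0.0
    return trigram_counts[(x, y, w)] / pair_total

def sequence_prob(tokens, trigram_counts, pair_counts):
    """Multiply P(w | x y) over every word that has two predecessors."""
    prob = 1.0
    for x, y, w in zip(tokens, tokens[1:], tokens[2:]):
        prob *= trigram_prob(w, x, y, trigram_counts, pair_counts)
    return prob
```

For example, after training on the word list "the cat sat on the mat the cat ran", the pair "the cat" occurs twice and is followed once by "sat", so trigram_prob("sat", "the", "cat", ...) is one half, while a trigram such as "the cat mat" that never occurs receives a probability of zero.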
Even for modest vocabularies, this estimate is poor because a large number of trigrams will never be seen in training, and the estimate assigns each such trigram a probability of zero. Thus, state-of-the-art language models attempt to smooth these probabilities using bigram, unigram, and uniform probability distributions. However, the method used for smoothing can influence the overall quality of the model, especially for small amounts of training data.
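One common way to combine these distributions is linear interpolation, sketched below. This is only one possible smoothing scheme and is not necessarily the one used by any particular recognizer. The function names and the weight values are assumptions made for illustration; in practice the weights would typically be tuned on held-out data.

```python
from collections import Counter

def build_counts(tokens):
    """Collect unigram, bigram, and trigram counts from a token list."""
    return {
        "uni": Counter(tokens),
        "bi": Counter(zip(tokens, tokens[1:])),
        "tri": Counter(zip(tokens, tokens[1:], tokens[2:])),
        "total": len(tokens),
    }

def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 if the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def interpolated_prob(w, x, y, counts, vocab_size,
                      weights=(0.6, 0.25, 0.1, 0.05)):
    """Smoothed P(w | x y) as a weighted sum of four distributions.

    The weights are illustrative assumptions; they must sum to 1.
    """
    w_tri, w_bi, w_uni, w_uniform = weights
    p_tri = ratio(counts["tri"][(x, y, w)], counts["bi"][(x, y)])
    p_bi = ratio(counts["bi"][(y, w)], counts["uni"][y])
    p_uni = ratio(counts["uni"][w], counts["total"])
    p_uniform = 1.0 / vocab_size
    return (w_tri * p_tri + w_bi * p_bi
            + w_uni * p_uni + w_uniform * p_uniform)
```

Because the uniform term is always positive, a trigram that was never seen in training still receives a nonzero probability, while trigrams that were seen keep a higher one.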
An alternative to the trigram language models described above is to have a fixed finite grammar of utterances, and to allow only sequences in the grammar to be recognized. This scheme is highly restrictive for natural language applications, in which there is no way to tabulate all the ways that a user might convey a certain concept.
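The restriction can be seen in a minimal sketch in which the grammar is simply an enumerated set of permitted word sequences. The utterances and the name in_grammar are hypothetical; real finite grammars are usually expressed as networks or rules rather than flat lists, but the effect on unlisted phrasings is the same.

```python
# Hypothetical fixed grammar: every permitted utterance is listed explicitly.
ALLOWED_UTTERANCES = {
    ("call", "home"),
    ("call", "the", "office"),
    ("play", "music"),
}

def in_grammar(words):
    """Return True only if the exact word sequence is in the grammar."""
    return tuple(words) in ALLOWED_UTTERANCES
```

Here in_grammar(["call", "home"]) is true, but a natural rephrasing such as "please phone my house" is rejected because it was never tabulated.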
Thus, there is a need for techniques that provide improved language models for use by a speech recognizer.