US 6,983,247 B2
Augmented-word language model
Eric K. Ringger, Issaquah, Wash. (US); and Lucian Galescu, Rochester, N.Y. (US)
Assigned to Microsoft Corporation, Redmond, Wash. (US)
Filed on Nov. 12, 2002, as Appl. No. 10/292,335.
Application 10/292335 is a division of application No. 09/657686, filed on Sep. 08, 2000, granted, now 6,606,597.
Prior Publication US 2003/0083863 A1, May 01, 2003
Int. Cl. G10L 15/06 (2006.01)
U.S. Cl. 704—251
16 Claims
1. A method of automatically recognizing speech, comprising:
providing a language model comprising a plurality of n-grams, each n-gram comprising a sequence of n augmented words, each augmented word comprising a word and a tag encoding lexical information regarding the word, the language model further comprising a probability indicator corresponding to each n-gram, each probability indicator indicative of a probability that, given an occurrence of a first n-1 words of the corresponding n-gram in a block of text, an immediately following word in the block of text will be the nth word of the n-gram;
hypothesizing a sequence of n-1 augmented words, each hypothesized augmented word comprising a hypothesized word and a tag encoding lexical information regarding the hypothesized word;
comparing the hypothesized sequence to the first n-1 words of a selected n-gram; and
if the hypothesized sequence matches the first n-1 augmented words of the selected n-gram, accessing the probability indicator corresponding to the selected n-gram to determine the probability that the word immediately following the hypothesized sequence in a block of text will be the nth augmented word of the selected n-gram.
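
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `AugmentedWord`, `AugmentedNGramModel`, and `next_word_probability` are hypothetical. The part-of-speech style tags and the probability values are made-up illustrations, and a plain dictionary is assumed as the store for the probability indicators.

```python
from typing import Dict, NamedTuple, Optional, Sequence, Tuple


class AugmentedWord(NamedTuple):
    """A word paired with a tag encoding lexical information about it."""
    word: str
    tag: str  # hypothetical tag scheme; the claim does not fix one


NGram = Tuple[AugmentedWord, ...]


class AugmentedNGramModel:
    """Maps each n-gram of augmented words to a probability indicator."""

    def __init__(self, n: int, probabilities: Dict[NGram, float]) -> None:
        for ngram in probabilities:
            if len(ngram) != n:
                raise ValueError(f"expected n-grams of length {n}")
        self.n = n
        self.probabilities = dict(probabilities)

    def next_word_probability(
        self, hypothesis: Sequence[AugmentedWord], selected: NGram
    ) -> Optional[float]:
        """Compare the hypothesized n-1 augmented words with the first n-1
        augmented words of the selected n-gram. On a match, return the stored
        probability that the nth augmented word follows; otherwise None."""
        if len(hypothesis) != self.n - 1:
            raise ValueError(f"hypothesis must contain {self.n - 1} augmented words")
        if selected not in self.probabilities:
            return None
        if tuple(hypothesis) != selected[:-1]:
            return None  # either a word or its tag differs
        return self.probabilities[selected]


# Illustrative trigram with an invented probability value.
trigram = (
    AugmentedWord("I", "PRP"),
    AugmentedWord("will", "MD"),
    AugmentedWord("read", "VB"),
)
model = AugmentedNGramModel(n=3, probabilities={trigram: 0.2})

matching = [AugmentedWord("I", "PRP"), AugmentedWord("will", "MD")]
mismatched_tag = [AugmentedWord("I", "PRP"), AugmentedWord("will", "NN")]

print(model.next_word_probability(matching, trigram))
print(model.next_word_probability(mismatched_tag, trigram))
```

Because the comparison covers both the word and its tag, a hypothesis containing the same word with a different tag does not match the selected n-gram, which is what distinguishes this lookup from one over plain words.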