The invention relates to machine recognition of spoken words. More particularly, the invention relates to methods and apparatus for generating machine models of spoken words, and articles for configuring machines to perform such methods.
In a speech recognition machine, each word in the machine vocabulary is represented by a set of one or more models. When a user desires to add a new word to the vocabulary of the speech recognizer, at least one model corresponding to the new word must be generated.
A method of generating a speech recognition model of a word based on the spelling of the word and one utterance of the word is described in an article by J. M. Lucassen et al. entitled "An Information Theoretic Approach to the Automatic Determination of Phonemic Baseforms" (Proceedings of the 1984 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 3, pages 42.5.1-42.5.4, March 1984).
An unrecognized problem in the Lucassen et al. method occurs if the user utters the new word multiple times. Each utterance of the new word will likely generate a different model. Because it will likely be impractical to store every word model generated by every utterance of the new word, there is a need to select a subset of one or more word models for the new word.