One of the better detectors for determining which keyword is present in a given utterance is the Hidden Markov Model (HMM) detector. In the HMM detector, digitized speech is analyzed against statistical models of the set of desired keywords (e.g., "one", "two", "three"), and a score is determined for each keyword. The keyword with the highest score is considered to be the "best fit". The highest score alone, however, does not reliably indicate whether a valid keyword has actually been spoken, because some keyword model always scores highest regardless of what was said. One method of detecting keywords is to compare the HMM scores to a predetermined threshold. This method represents an initial approach to incorporating rejection in a speech recognizer.
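The best-fit selection and threshold comparison described above can be sketched as follows. This is a minimal illustration and not the detector's actual implementation: the function names, the dictionary of per-keyword scores, and the threshold value are all hypothetical, and the scores are assumed to have been produced already by some HMM scoring step.

```python
from typing import Dict, Optional, Tuple


def best_fit(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (keyword, score) pair with the highest score."""
    return max(scores.items(), key=lambda item: item[1])


def recognize(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Accept the best-fit keyword only if its score reaches the threshold.

    Returns None when the utterance is rejected.
    """
    keyword, score = best_fit(scores)
    return keyword if score >= threshold else None


# Hypothetical scores for one utterance (higher is better).
example_scores = {"one": -120.0, "two": -95.0, "three": -140.0}

accepted = recognize(example_scores, threshold=-100.0)  # best fit passes
rejected = recognize(example_scores, threshold=-90.0)   # best fit fails
```

Note that `best_fit` always returns some keyword, so rejection depends entirely on how well the single threshold separates valid keywords from everything else.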
In order to achieve more reliable results, the HMM scores have to be further analyzed (using linear discrimination analysis, for example) to determine whether any of the scores is above a threshold of "goodness" before a determination can be made that a specific utterance was in fact the keyword. However, HMM scores vary widely according to the length of the verbal utterance, the speaker, and the speaking environment (e.g., the voice path, the microphone, etc.). This variation is not correctable even by post-processing with linear discrimination analysis and, thus, a simple comparison with a threshold may not be accurate. Therefore, a problem in the art is that HMM scoring, alone or with a linear discrimination analysis threshold determination, is not well suited to establishing a good test for acceptance of the highest-scoring keyword model.
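The length dependence noted above can be illustrated with a small sketch. It assumes, purely for illustration, that an HMM score is a log-likelihood accumulated frame by frame, so that two utterances matching their models equally well per frame receive different total scores when their lengths differ. The function name, the per-frame value, and the threshold are hypothetical.

```python
def total_log_score(per_frame_log_likelihood: float, num_frames: int) -> float:
    """Total score under the assumption that each frame contributes equally.

    Real per-frame contributions vary; a constant is used here only to
    isolate the effect of utterance length.
    """
    return per_frame_log_likelihood * num_frames


THRESHOLD = -150.0  # hypothetical fixed acceptance threshold

# Same per-frame match quality, different utterance lengths.
short_score = total_log_score(-2.0, num_frames=50)
long_score = total_log_score(-2.0, num_frames=100)

short_accepted = short_score >= THRESHOLD
long_accepted = long_score >= THRESHOLD
```

Under these assumptions the shorter utterance is accepted and the longer one is rejected, although both match equally well per frame. Speaker and environment differences shift the scores in a similar way.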