Most state-of-the-art speech recognition systems incorporate an utterance rejection module to reject speech-like utterances in which the recognition system has low confidence. Low confidence arises when the speaker uses an open microphone that picks up spurious background noise, or when something is uttered or picked up that is phonetically similar to, but is not, a legitimate phrase. For example, in U.S. Pat. No. 5,710,864, assigned to the assignee of the present invention, discriminative techniques are employed wherein both keywords and phonetically similar alternatives (anti-models) are stored in vocabularies and compared with the input speech, and an output signal is generated representing a confidence measure corresponding to the relative accuracy of the identification step. The confidence measure uses a likelihood ratio test, also known as hypothesis testing. In hypothesis testing, the null hypothesis (that the input utterance is correctly recognized) is tested against the alternate hypothesis (that the input utterance is not correctly recognized). If the likelihood ratio between the null and alternate hypotheses, normalized by the utterance length, exceeds a critical threshold, the utterance is accepted as correctly recognized; otherwise, the utterance is rejected.
In statistical hypothesis testing, the problem formulation is to test the null hypothesis, $H_0$, that the input speech utterance $O = o_1 \cdot o_2 \cdots o_T$ is correctly recognized, against the alternate hypothesis, $H_1$. If the probabilities for the null and alternate hypotheses are known exactly, then according to the Neyman-Pearson lemma, the optimal test (in the sense of maximizing the power of the test) is the likelihood ratio test, such that the null hypothesis $H_0$ is accepted if the likelihood ratio between the null and alternate hypotheses exceeds a critical threshold. This criterion, expressed in the log domain and normalized by the utterance length, is:

$$\frac{1}{T}\left[\log P(O \mid H_0) - \log P(O \mid H_1)\right] \geq \eta$$

where $T$ is the length of the input utterance, $\log P(O \mid H_0)$ and $\log P(O \mid H_1)$ are, respectively, the log-probability of the input utterance for the null hypothesis and the alternate hypothesis, and $\eta$ is the rejection threshold of the test. For testing simple hypotheses where the probability density functions for $H_0$ and $H_1$ are known exactly, the likelihood ratio test is the most powerful test for a given level of significance.
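The length-normalized test described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `accept_utterance`, its parameter names, and the choice to treat a score exactly equal to the threshold as an acceptance are assumptions made here, and the two log-probabilities are assumed to have been computed elsewhere (for example, by the recognizer's keyword models and anti-models).

```python
def accept_utterance(log_p_null, log_p_alt, num_frames, threshold):
    """Length-normalized log-likelihood ratio test (illustrative sketch).

    log_p_null : log P(O | H0), log-probability under the null hypothesis
    log_p_alt  : log P(O | H1), log-probability under the alternate hypothesis
    num_frames : T, the length of the input utterance
    threshold  : eta, the rejection threshold

    Returns (accepted, score), where score is the normalized log ratio.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")

    # Log of the likelihood ratio, normalized by the utterance length T.
    score = (log_p_null - log_p_alt) / num_frames

    # Assumption: a score equal to the threshold counts as an acceptance.
    return score >= threshold, score


# Hypothetical values, chosen only to exercise both branches.
accepted_a, score_a = accept_utterance(-120.0, -150.0, 100, 0.2)
accepted_b, score_b = accept_utterance(-120.0, -125.0, 100, 0.2)
```

With these hypothetical inputs, the first call yields a normalized score above the threshold and is accepted, while the second falls below it and is rejected.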
We have found that while the above method works well for long utterances, it works quite poorly for short utterances. Many more short, out-of-grammar utterances tend to be accepted by the system as having been correctly recognized, whereas these utterances should be rejected.