The concept of verbal utterance acceptance and rejection is increasingly prevalent in a wide variety of technologies and products that have emerged in recent years. For instance, one technology gaining significantly in popular acceptance and use is automatic telephone dialing whereby, upon the user uttering a keyword or keyphrase such as “Mom”, “Office”, “Dr. Smith”, etc., an appropriate telephone number corresponding to the keyword or keyphrase is automatically dialed, thus obviating the need for the user to have committed the number to memory or to have looked it up. A distinct advantage over keypad-type memory-based dialing systems, in which a commonly used number can be automatically dialed by pushing one or a few buttons on a telephone, is that the keypad shortcuts themselves do not have to be looked up or committed to memory either. Other applications of verbally prompted commands are of course prevalent and contemplated, and their use is bound to increase with the development of additional technologies and products that are well-suited for such commands.
Conventional methods and apparatus for verifying spoken passwords and sentences employ “acoustic likelihoods” resulting from a decoding process. An acoustic likelihood is the probability that a spoken password or sentence actually matches a given target password or sentence.
Conventionally, acoustic likelihoods are normalized on an utterance basis, and predetermined thresholds are applied for purposes of verification (i.e., should a verbal utterance meet a certain threshold in terms of the degree to which it matches a target word or phrase, based on given factors, it is construed as sufficiently matching the target word or phrase).
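By way of illustration only, the conventional normalize-then-threshold scheme described above might be sketched as follows. The sketch assumes that the decoder yields one log-likelihood per acoustic frame and that normalization on an utterance basis means averaging over the frames of the utterance; the function names, the per-frame representation, and the threshold value are hypothetical and are not drawn from any of the cited references.

```python
from typing import Sequence


def normalized_score(frame_log_likelihoods: Sequence[float]) -> float:
    """Average the per-frame log-likelihoods over the whole utterance.

    Assumption: utterance-level normalization is taken here to mean
    dividing the total log-likelihood by the number of frames, so that
    long and short utterances are scored on a comparable scale.
    """
    if not frame_log_likelihoods:
        raise ValueError("utterance contains no frames")
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)


def accept_utterance(frame_log_likelihoods: Sequence[float],
                     threshold: float) -> bool:
    """Accept the utterance if its normalized score meets the threshold.

    The threshold is a predetermined, hypothetical value; in practice it
    would be tuned on held-out data.
    """
    return normalized_score(frame_log_likelihoods) >= threshold


if __name__ == "__main__":
    # Hypothetical per-frame log-likelihoods against the target phrase.
    good_match = [-1.0, -2.0, -3.0]
    poor_match = [-4.0, -6.0]
    print(accept_utterance(good_match, threshold=-2.5))
    print(accept_utterance(poor_match, threshold=-2.5))
```

In such a scheme, a single fixed threshold decides acceptance or rejection for every utterance, which is the characteristic shared by the threshold-based techniques discussed herein.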
An approach in the vein of the above is disclosed in U.S. Pat. No. 5,717,826 (Lucent Technologies, Inc.). In this case, however, a full decoder is used to obtain the transcription of the keywords, and the password modeling is done outside of the decoder in a second stage.
Similar arrangements are disclosed elsewhere but likewise tend not to solve problems or address issues in the manner presently desired. U.S. Pat. No. 5,465,317, entitled “Speech recognition system with improved rejection . . . ”, discloses a threshold-based technique based on acoustic likelihoods, as does U.S. Pat. No. 5,613,037, entitled “Rejection of non-digit strings for connected digit speech recognition”.
In view of the foregoing, a need has been recognized in conjunction with improving upon previous efforts in the field and surmounting their shortcomings discussed heretofore.