Speech recognition is an imperfect art. Because of the many variables involved, including differences in microphones, accents, and speakers' abilities, it is not possible today to recognize speech with a level of reliability that satisfies the demands of many applications. There is a need, especially over the telephone, for higher-performance speech recognition.
Speech recognition can be enhanced by constraining the words that the system may recognize to a small set, such as the ten digits from 0 to 9. Nevertheless, even with a vocabulary of only ten digits, speaker-independent speech recognition remains a difficult problem.
Speech recognition has recently been applied to access control. In any access or entry control system, the user provides two key functions: (1) user identification and (2) user verification. The user identification function allows an unknown user to provide a claimed identity. With a speech recognition device, the user may speak his identification code. The verification function is performed on a personal attribute of the user, for example, the user's voice characteristics. Thus, both the identification and verification functions can be performed on the same spoken utterance.
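By way of illustration only, the two functions applied to a single utterance can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the `Utterance` record, the `ENROLLED` table, the feature vectors, the cosine-similarity comparison, and the 0.95 threshold are all hypothetical, and the recognizer that would produce the digit string is assumed to exist elsewhere.

```python
import math
from dataclasses import dataclass


@dataclass
class Utterance:
    """One spoken input, already processed by a hypothetical front end."""
    digits: str            # identification code as recognized from the speech
    voice_features: list   # hypothetical vector of the speaker's voice characteristics


# Hypothetical enrollment table: identification code -> stored voice template.
ENROLLED = {"4821": [0.9, 0.1, 0.3]}


def cosine_similarity(a, b):
    """Similarity between two feature vectors (assumed comparison method)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def check_access(utterance, threshold=0.95):
    """Identify and verify a user from the same utterance."""
    # (1) Identification: the recognized code is the claimed identity.
    template = ENROLLED.get(utterance.digits)
    if template is None:
        return False  # no such claimed identity is enrolled
    # (2) Verification: compare the voice characteristics of the same
    # utterance against the template stored for the claimed identity.
    return cosine_similarity(utterance.voice_features, template) >= threshold
```

In this sketch, an utterance carrying an enrolled code and a matching voice is accepted, while the same code spoken with clearly different voice characteristics, or a code that is not enrolled, is rejected. No second input is requested from the user at any point.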
Two benefits accrue from a combined speech recognition/speaker verification capability. First, verification time is reduced considerably, because the input speech data used for identification are also used for verification, thus eliminating any separate input time for verification. Second, eliminating all but speech inputs provides operational advantages: the user's hands are freed, and the verification terminal can be made less expensive and more mobile. In a voice communication system, other auxiliary sensing or verification devices may be eliminated.
In order to perform the identification function using a spoken identification code, speech recognition must be employed. Speech recognition translates the spoken words into a discrete identification code.
In order to provide a useful system, the speech recognition portion of the entry system must be speaker-independent, i.e., able to translate the spoken code into words when the speaker is not yet known. Further, for broad application, it must be able to recognize words in connected speech, i.e., speech which does not have pauses between words. These requirements increase the difficulty of providing accurate speech recognition.
Therefore, a need has arisen in the industry to improve the performance of speech recognition systems.