A. Field of the Invention
The present invention relates generally to methods and systems for recognizing speech, and, more particularly, to voice recognition methods and systems for evaluating candidate utterances before enrolling them in a recognition dictionary.
B. Description of the Related Art
In recent years, voice recognition systems have become both more popular and more sophisticated. Voice Activated Dialing (VAD) systems, for example, use voice recognition systems to speed dial or access a party based upon a single voice command. In a VAD system, the voice command is typically the name of the party the user wishes to call, such as the phrase "Bob Johnson." The VAD system then accesses a pre-programmed dictionary containing the telephone number associated with each pre-programmed party. Once the VAD system locates the entry for "Bob Johnson," it calls the corresponding telephone number.
One problem with VAD systems is that users will sometimes enter the same phrase or a confusingly similar phrase into the dictionary. For instance, if a user enters the phrase "Bob Johnson" when another party by that name already exists in the dictionary, the wrong telephone number may be called depending upon which "Bob Johnson" the user intends to reach. An error may also occur if the user adds a confusingly similar name, such as "Bob Johnston." Thus, when the user tries to call "Bob Johnston," the VAD system may erroneously dial the number for "Bob Johnson" instead.
U.S. Pat. No. 5,452,397 discloses a VAD system that accounts for the storage of phrases that are the same as, or confusingly similar to, previously enrolled phrases. According to this method, the user must utter a candidate phrase two times to store that phrase in the dictionary. The system stores the first utterance of the candidate phrase in the dictionary and assigns to that phrase, and to each phrase already in the dictionary, a probability representing the likelihood that the respective phrase will match the second utterance of the candidate phrase. The '397 patent discloses that the candidate phrase is assigned a lower probability (e.g., 0.8) than that assigned to the phrases already enrolled in the dictionary (e.g., 1.0). The augmented dictionary is thereby skewed so that the second utterance of the candidate phrase is more likely to be recognized as one of the previously enrolled phrases than as the first utterance of the candidate phrase. The VAD system then receives the second utterance of the candidate phrase and compares it to each of the phrases in the augmented dictionary to determine whether the candidate phrase should be enrolled.
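By way of illustration only, the two-utterance procedure described above may be sketched as follows. The sketch is not taken from the '397 patent. The function names `similarity` and `evaluate_candidate` are hypothetical, phrases are represented as text strings rather than acoustic templates, and a string-similarity ratio from the Python standard library stands in for whatever acoustic matching score a real recognizer would compute. Only the example weights of 0.8 and 1.0 come from the description above.

```python
from difflib import SequenceMatcher

ENROLLED_WEIGHT = 1.0   # example value given above for already enrolled phrases
CANDIDATE_WEIGHT = 0.8  # example value given above for the candidate phrase


def similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for an acoustic match score, in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def evaluate_candidate(enrolled, first_utterance, second_utterance):
    """Return (accepted, best_match) for one two-utterance enrollment attempt."""
    # Augment the dictionary: enrolled phrases keep the higher weight and the
    # first utterance of the candidate is added with the lower weight.
    augmented = [(phrase, ENROLLED_WEIGHT, False) for phrase in enrolled]
    augmented.append((first_utterance, CANDIDATE_WEIGHT, True))

    # Score the second utterance against every entry, scaled by the entry's
    # weight, and keep the highest-scoring entry.
    best = max(
        augmented,
        key=lambda entry: entry[1] * similarity(entry[0], second_utterance),
    )

    # The candidate is accepted only if the second utterance is recognized as
    # the first utterance rather than as a previously enrolled phrase.
    phrase, _weight, is_candidate = best
    return is_candidate, phrase


if __name__ == "__main__":
    print(evaluate_candidate(["Bob Johnson", "Alice Smith"],
                             "Carol Davis", "Carol Davis"))
    print(evaluate_candidate(["Bob Johnson"],
                             "Bob Johnston", "Bob Johnston"))
```

In this sketch, a candidate "Bob Johnston" repeated identically is still rejected when "Bob Johnson" is enrolled: the candidate scores 0.8 x 1.0 = 0.8, while the enrolled phrase scores roughly 1.0 x 0.96 under the stand-in similarity measure. With equal weights the candidate would have scored higher, which illustrates how the lower weight tilts the comparison toward the enrolled phrases.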
One problem with the method and system disclosed in the '397 patent is that, by assigning a lower probability to the first utterance of the candidate phrase, the dictionary is biased toward determining that the candidate phrase is confusing. This, in turn, increases the probability that phrases sought to be enrolled in the VAD system will be falsely rejected. Another problem is that, when a phrase is found to be confusing, the user is required to repeat the entire entry process in order to try to store the candidate phrase. The approach disclosed in the '397 patent also fails to ensure that the first and second utterances of the candidate phrase are consistent with one another. Thus, an error may occur when the user says the two utterances differently. There is a need, therefore, for a voice recognition system that can accurately determine the confusability or inconsistency of a candidate phrase in a user-friendly environment.