1. Field of the Invention
The present invention relates to speech recognition, and more particularly, to a method and apparatus for enhancing the performance of speech recognition by adaptively changing the process of determining a final, recognized word depending on a user's selection from a list of alternative words derived from a result of speech recognition.
2. Description of the Related Art
Speech recognition refers to a technique by which a computer analyzes and recognizes or understands human speech. Human speech sounds have specific frequencies that depend on the shape of the mouth and the position of the tongue during utterance. Speech recognition technology exploits this property: speech sounds are converted into electric signals, and frequency characteristics of the speech sounds are extracted from the electric signals in order to recognize human utterances. Such speech recognition technology is used in a wide variety of fields such as telephone dialing, control of electronic toys, language learning, and control of electric home appliances.
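As a rough illustration of what extracting frequency characteristics from a digitized signal can mean, the following sketch computes a magnitude spectrum for one short frame of samples and reports its strongest frequency. The function names, the 8 kHz sample rate, and the 64-sample frame are assumptions chosen for the example and are not taken from any system cited here; practical recognizers use faster transforms and richer features.

```python
import cmath
import math
from typing import List


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Return DFT magnitudes for bins 0..N/2 of a real-valued frame (naive O(N^2))."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc))
    return spectrum


def dominant_frequency(frame: List[float], sample_rate: float) -> float:
    """Return the frequency in Hz of the strongest non-DC bin."""
    mags = magnitude_spectrum(frame)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(frame)


# Hypothetical input: a pure 1000 Hz tone sampled at 8 kHz, 64 samples long.
SAMPLE_RATE = 8000.0
tone = [math.sin(2 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(64)]
peak_hz = dominant_frequency(tone, SAMPLE_RATE)
```

For this synthetic tone the peak falls exactly on a bin, so `peak_hz` equals 1000.0; real speech spreads energy over many bins, which is why whole spectra rather than a single peak serve as features.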
Despite advances in speech recognition technology, perfect recognition is not yet achievable in an actual speech recognition environment because of background noise and similar factors. Thus, errors frequently occur in speech recognition tasks. To reduce the probability of such errors, methods have been employed that determine a final, recognized word through user confirmation or selection, either by requesting the user to confirm the recognition results of a speech recognizer or by presenting the user with a list of alternative words derived from those recognition results.
Conventional techniques associated with the above methods are disclosed in U.S. Pat. Nos. 4,866,778, 5,027,406, 5,884,258, 6,314,397, 6,347,296, and so on. U.S. Pat. No. 4,866,778 suggests a technique by which the most probable alternative word found by the search is displayed and, if that alternative word is wrong, the next alternative word is displayed, until the correct recognition result is found. According to this technique, a user must separately answer a series of YES/NO questions presented by a speech recognition system and cannot predict which words will appear in the next question. U.S. Pat. Nos. 5,027,406 and 5,884,258 present a technique by which alternative words derived from speech recognition are arrayed and recognition results are determined by the user's selection from the alternative words via a graphic user interface or voice. According to this technique, since the user must perform an additional manipulation to select the correct alternative word every time he or she speaks, the user is inconvenienced and finds the repeated operations tiresome. U.S. Pat. No. 6,314,397 shows a technique by which a user's utterances are converted into text based on the best recognition results and corrected through a user review, during which an alternative word is selected from a list of alternative words derived from previously considered recognition results. This technique allows the speech recognition task to proceed smoothly. However, when the user uses a speech recognition system in real time, the user must compose a sentence while viewing the recognition results. U.S. Pat. No. 6,347,296 discloses a technique by which, during a series of speech recognition tasks, an uncertain recognition result of a specific utterance is resolved by automatically selecting an alternative word from a list of alternative words with reference to the recognition result of a subsequent utterance.
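The two interaction styles described above, sequential YES/NO confirmation and selection from an arrayed list, can be contrasted with a minimal sketch. It is not the implementation of any cited patent: the function names, the callback-based user interface, and the sample word list are all hypothetical, and the alternatives are simply assumed to arrive ranked best first.

```python
from typing import Callable, List, Optional


def confirm_sequentially(
    alternatives: List[str],
    ask_yes_no: Callable[[str], bool],
) -> Optional[str]:
    """Present candidates one at a time, best first, until one is accepted."""
    for word in alternatives:
        if ask_yes_no(word):
            return word
    return None  # every candidate was rejected


def select_from_list(
    alternatives: List[str],
    choose_index: Callable[[List[str]], int],
) -> str:
    """Show the whole ranked list and let the user pick one entry."""
    return alternatives[choose_index(alternatives)]


def count_prompts_sequential(alternatives: List[str], intended: str) -> int:
    """Count the YES/NO questions asked before the intended word is accepted."""
    prompts = 0

    def ask(word: str) -> bool:
        nonlocal prompts
        prompts += 1
        return word == intended

    confirm_sequentially(alternatives, ask)
    return prompts


# Hypothetical ranked recognition result; the user actually said "tall".
ranked = ["call", "fall", "tall"]
questions_needed = count_prompts_sequential(ranked, "tall")
picked = select_from_list(ranked, lambda words: words.index("tall"))
```

With the intended word ranked third, the sequential style asks three questions, whereas list selection needs a single user action. In both styles, however, the user must act even when the top-ranked word is already correct.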
As described above, according to conventional speech recognition technology, even when a correct recognition result of the user's speech is obtained, an additional task such as user confirmation or selection must be performed at least once. In addition, if the user does not perform the confirmation, the final, recognized word may remain undetermined indefinitely.