1. Field of the Invention
The present invention relates to speech recognition technology and, more particularly, to speech recognition technology in which, in order to handle any ambiguous portion that may be erroneously recognized during speech recognition, a speaker is questioned about the contents of the ambiguous portion, and the identified ambiguity is resolved according to the speaker's response to the question.
2. Description of the Related Art
In conventional continuous speech recognition systems, although a region of an utterance may be erroneously recognized during speech recognition, no consideration has been given to the possibility of such an error, which leaves the recognition result with low confidence. Even when attempts are made to estimate and eliminate any region in which acoustic or semantic errors may occur, only results determined unilaterally in accordance with internal rules of the system are output. However, the internal rules of the system are very incomplete, resulting in a high error rate. As such, the speech recognition system does not have 100% accuracy. Thus, it is necessary to provide a method capable of improving the low recognition accuracy of a spoken dialogue system.
Korean Patent Unexamined Publication No. 2001-0086902, titled “HUMAN RESPONSE-BASED SPEECH RECOGNITION APPARATUS”, discloses an apparatus that includes an ambiguity range extractor, which extracts an ambiguity range from a sentence, and a question generator, which generates questions to eliminate the ambiguity range. However, this invention does not consider how to ask the user questions in order to hold a successful and efficient dialogue between a speech recognition system and the user. In order to enhance the intelligence, performance and convenience of the speech recognition system, phenomena that occur in dialogues between human beings should be analyzed, so that the system can conduct a dialogue with the efficiency, effectiveness and flexibility found in human conversation.
In addition, U.S. Pat. No. 6,567,778, titled “NATURAL LANGUAGE SPEECH RECOGNITION USING SLOT SEMANTIC CONFIDENCE SCORES RELATED TO THEIR WORD RECOGNITION CONFIDENCE SCORES”, employs a method that forms slots from the results of speech recognition using information on the specifications that an application program requires. The method determines a slot confidence score for each slot, such that when the slot confidence score is low, the user is questioned about the slot having the low slot confidence score. Since this method is highly dependent upon the application program, it is difficult to use when the requirements of the application program cannot readily be expressed as slots. For example, when an application program provides a plurality of domains at the same time, such as when the application program performs daily dialogue rather than task-oriented dialogue, or when the dialogue initiative is taken not by the system alone but by both the user and the system, it is difficult to form slots and therefore difficult to use the method.
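By way of illustration only, the slot-based questioning described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the cited patent: the `Slot` structure, the rule that a slot's confidence is the lowest confidence among its words, the threshold value of 0.6, and all function names are hypothetical and chosen solely for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    """A hypothetical slot filled from recognized words."""
    name: str
    words: List[str]
    word_confidences: List[float]  # one recognition score per word, in [0, 1]


def slot_confidence(slot: Slot) -> float:
    # Assumption: the weakest word bounds the confidence of the whole slot.
    # The cited patent may relate slot and word scores differently.
    if not slot.word_confidences:
        return 0.0
    return min(slot.word_confidences)


def slots_to_confirm(slots: List[Slot], threshold: float = 0.6) -> List[str]:
    # Return the names of slots whose confidence falls below the
    # (hypothetical) threshold, i.e. the slots the user would be asked about.
    return [s.name for s in slots if slot_confidence(s) < threshold]


def confirmation_question(slot: Slot) -> str:
    # Build a simple confirmation question for a low-confidence slot.
    phrase = " ".join(slot.words)
    return f"Did you say '{phrase}' for {slot.name}?"
```

In such a sketch, a slot filled with the words "next friday" at word confidences 0.81 and 0.42 would fall below the threshold and trigger a question, whereas a slot filled with a single word at 0.92 would not. The sketch also makes the dependency noted above visible: the slot names must be known in advance from the application program.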
The foregoing techniques offer no remedy for further failures when speech recognition remains unsuccessful even after the user has been asked a question again. In such cases, it may be impossible to handle a command from the user. Therefore, in a spoken dialogue system or speech recognition system in which a user's requests are handled by conducting a dialogue between the user and the system using speech as an interface, there is a need for a method capable of handling speech recognition errors that occur repeatedly.