1. Field
Embodiments relate to a dialogue system that intelligently answers a user's natural-language question and re-requests a dialogue from the user, thereby increasing the quality of dialogue with the user, and to a dialogue method for use in the system.
2. Description of the Related Art
A dialogue system is designed to have a conversation or dialogue with a user and to carry out a command of the user, and may be included not only in a network-based server or terminal but also in a robot or the like.
A dialogue system uses an interface either to carry out a conversation with the user or to receive a user command, and the interface may include, for example, a keyboard and a mouse. To use such an interface, the user must move to the specific place where the keyboard and the mouse are located and manipulate them in order to engage in dialogue with the dialogue system and enter a command, which is inconvenient. If the dialogue system is a robot, it is difficult to mount such an interface on the robot because of the robot's mobility. Therefore, the dialogue system generally uses a speech recognition interface, which is a non-contact interface, to interface with the user.
In this case, the speech recognition interface extracts characteristics of the user's speech, applies a pattern recognition algorithm to the extracted characteristics, and recognizes the speech by back-tracking the phoneme string or word string generated when the user spoke, so that the user can convey desired information verbally.
The speech recognition used in the above-mentioned dialogue system has low recognition performance for speech spoken by the user, so that the dialogue system has difficulty conversing with the user. To solve this problem, a variety of methods of enabling the dialogue system to converse easily with the user have recently been proposed. A representative one of such methods is a domain-based speech recognition method.
The domain-based speech recognition scheme defines a plurality of domains for individual topics (e.g., weather, sightseeing, etc.), generates a dedicated language model for each domain, performs primary speech recognition of the user's speech on the basis of the generated language models to recognize a keyword, performs secondary speech recognition using the language model of the domain corresponding to the recognized keyword, and thereby recognizes the intention of the user's speech, so that a natural conversation with the user becomes possible.
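The two-pass flow described above may be illustrated by the following minimal sketch. It is not taken from any particular related-art system: acoustic decoding is replaced by already-tokenized text, each per-domain language model is reduced to a plain vocabulary set, and the table contents and the names KEYWORD_TO_DOMAIN, DOMAIN_VOCABULARY, primary_recognition, secondary_recognition, and recognize are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical keyword table: each keyword selects exactly one domain.
KEYWORD_TO_DOMAIN: Dict[str, str] = {
    "rain": "weather",
    "temperature": "weather",
    "museum": "sightseeing",
    "palace": "sightseeing",
}

# Hypothetical per-domain vocabularies standing in for per-domain language models.
DOMAIN_VOCABULARY: Dict[str, Set[str]] = {
    "weather": {"will", "it", "rain", "tomorrow", "temperature", "today"},
    "sightseeing": {"where", "is", "the", "museum", "palace", "open", "today"},
}


def primary_recognition(tokens: List[str]) -> Optional[str]:
    """First pass: return the first token that is a known keyword, if any."""
    for token in tokens:
        if token in KEYWORD_TO_DOMAIN:
            return token
    return None


def secondary_recognition(tokens: List[str], domain: str) -> List[str]:
    """Second pass: keep only tokens covered by the selected domain's vocabulary."""
    vocabulary = DOMAIN_VOCABULARY[domain]
    return [token for token in tokens if token in vocabulary]


def recognize(utterance: str) -> Optional[Dict[str, object]]:
    """Run both passes; return None when no keyword is found in the first pass."""
    tokens = utterance.lower().split()
    keyword = primary_recognition(tokens)
    if keyword is None:
        return None
    domain = KEYWORD_TO_DOMAIN[keyword]
    return {
        "keyword": keyword,
        "domain": domain,
        "tokens": secondary_recognition(tokens, domain),
    }
```

In this sketch the domain is fixed entirely by the keyword found in the first pass, so everything done in the second pass depends on that single decision.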
In this case, if an unexpected error occurs in the primary speech recognition process, the domain-based speech recognition scheme carries out the secondary speech recognition process using the language model of the domain selected by the wrongly recognized keyword, with no further opportunity to recover from the error. It therefore unavoidably produces a wrong recognition result, which reduces speech recognition accuracy.
Also, if the sentence recognized from the user's speech includes a keyword corresponding to two or more domains, the above-mentioned recognition scheme has difficulty in identifying a single domain from among them.
In this way, domain-based speech recognition based on the language model (LM) determines a domain using only the speech recognition result. As a result, if the domain search space is very large and an unexpected error occurs during speech recognition, speech recognition is very likely to fail, and so is recognition of the user's intention. Even when speech recognition is carried out normally, if the speech recognition result applies to several domains in common, it is difficult to determine a domain.
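The two failure modes summarized above may be illustrated by the following minimal sketch, which assumes that domain selection is a plain keyword lookup. The table contents, the "transport" domain, and the names KEYWORD_TO_DOMAINS, candidate_domains, and select_domain are hypothetical and serve only to illustrate the point.

```python
from typing import Dict, List

# Hypothetical keyword table; a keyword may belong to more than one domain.
KEYWORD_TO_DOMAINS: Dict[str, List[str]] = {
    "rain": ["weather"],
    "train": ["transport"],
    "park": ["sightseeing", "transport"],  # a public park versus parking a vehicle
}


def candidate_domains(keyword: str) -> List[str]:
    """Return every domain associated with the recognized keyword."""
    return KEYWORD_TO_DOMAINS.get(keyword, [])


def select_domain(keyword: str) -> str:
    """Select a domain using only the recognized keyword, with no other evidence."""
    candidates = candidate_domains(keyword)
    if not candidates:
        return "unknown"
    if len(candidates) > 1:
        return "ambiguous"
    return candidates[0]


# Failure mode 1: "rain" misrecognized as "train" silently selects the wrong
# domain, and nothing later in the flow can recover from the error.
wrong_domain = select_domain("train")

# Failure mode 2: a correctly recognized keyword that is shared by several
# domains cannot be resolved from the recognition result alone.
unresolved = select_domain("park")
```

Because the lookup sees only the recognized keyword, neither case can be corrected without some additional source of information.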