The present invention relates generally to spoken dialogue systems, and more particularly to a method and apparatus for portable dialogue management in a spoken dialogue system.
Rapid progress in computer technology makes it possible for a human user to communicate with a computer using spoken dialogue. The function of a spoken dialogue system is to provide a method that allows a human user to communicate with a computer by means of natural language words. A spoken dialogue system can be used in many different applications, such as conversational systems for weather inquiries, railroad information access, or city guides.
FIG. 1 illustrates the general architecture of a typical spoken dialogue system. The typical spoken dialogue system includes five modules, i.e., speech recognition module 101, language understanding module 102, dialogue management module 103, language generation module 104, and speech synthesis module 105.
The spoken dialogue system first converts the speech input into sentences using speech recognition module 101. Secondly, the language understanding module 102 makes use of a vocabulary set, grammar rules and semantic knowledge on the language to represent the semantic meaning of the sentences. Based on the semantic representation, the dialogue management module 103 takes appropriate actions and passes a response semantic frame to the language generation module 104. The language generation module 104 generates appropriate sentences in the target language from the semantic frame. According to the generated sentences from the language generation module 104, the speech synthesis module 105 finally synthesizes speech and provides appropriate responses to the user.
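The five-stage flow described above can be pictured as a chain of functions. The following is only a minimal sketch: every function name and the toy keyword logic are hypothetical stand-ins for modules 101 to 105, and plain text takes the place of real audio.

```python
from typing import Dict


def recognize(speech: str) -> str:
    # Stand-in for speech recognition module 101; the "speech" is already text here.
    return speech.lower().strip()


def understand(sentence: str) -> Dict[str, str]:
    # Stand-in for language understanding module 102; a toy keyword match.
    frame = {"intent": "unknown"}
    if "weather" in sentence:
        frame["intent"] = "weather_query"
    return frame


def manage(frame: Dict[str, str]) -> Dict[str, str]:
    # Stand-in for dialogue management module 103; picks a response semantic frame.
    if frame["intent"] == "weather_query":
        return {"act": "ask", "slot": "city"}
    return {"act": "clarify"}


def generate(response: Dict[str, str]) -> str:
    # Stand-in for language generation module 104.
    if response["act"] == "ask":
        return f"Which {response['slot']} do you mean?"
    return "Could you rephrase that?"


def synthesize(sentence: str) -> str:
    # Stand-in for speech synthesis module 105; marks the text as audio output.
    return f"<audio>{sentence}</audio>"


def run_turn(speech: str) -> str:
    # One user turn passes through the five modules in order.
    return synthesize(generate(manage(understand(recognize(speech)))))
```

In this sketch only `manage` decides what the system does next, which is why the dialogue management module carries most of the domain-dependent behavior.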
The dialogue management module is the kernel for controlling the dialogue flow between a user and a spoken dialogue system. In different applications, the system may be engaged in different dialogues between a user and the system. Therefore, a dialogue management module is the module that has to deal with many domain-dependent characteristics in a spoken dialogue system. In developing a spoken dialogue system, it is very important to have a portable dialogue manager that can be easily ported to a different domain.
There have been several approaches to designing a dialogue manager embodied in dialogue systems. For example, Glass et al. presented “Multilingual Spoken-language Understanding in the MIT Voyager System” in Speech Communication, Vol. 17, No. 1-18, 1995. A spoken dialogue system can be quickly constructed using Glass's approach. However, in the design it is necessary to modify the dialogue manager when the interaction over the dialogue is changed. In other words, the dialogue manager is not portable.
An alternative approach is to develop a dialogue manager based on a finite state network model. The following are some examples of prior art on the subject:
Kita K. et al., “Automatic Acquisition of Probabilistic Dialogue Models”, Proceedings of ICSLP'96, pp. 196-199, Philadelphia, USA, 1996.
Levin et al., “Using Markov Decision Process for Learning Dialogue Strategies”, Proceedings of ICASSP'98, pp. 201-203, Seattle, USA, 1998.
Colton et al., “A Laboratory Course for Designing and Testing Spoken Dialogue Systems”, Proceedings of ICASSP'96, pp. 1129-1132, Atlanta, Ga., 1996.
In the above art, a dialogue management technology is used to collect all possible dialogues and messages from other system resources throughout the interaction with the user. Then, a network having a plurality of nodes is formed based on the dialogues and messages. The network connections are constructed according to the relationships among the nodes. The complete control over the dialogue is then handled directly by the network.
Such a dialogue manager is portable. However, it can only manage dialogue that is highly structured, because all possible states of the dialogue and the connections among those states must be defined in advance. In addition, the complete control over the dialogue is directed along the paths defined in the network. Therefore, the dialogue manager is better suited to a system in which the dialogue flow is system-initiated.
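A finite state network of this kind can be sketched as a dictionary of nodes and labeled edges. This is a simplified illustration, not the method of any cited work; the node names, prompts, and the `*` wildcard edge are hypothetical.

```python
# Hypothetical network: each node holds a system prompt and its outgoing edges,
# keyed by the user input that triggers them ("*" matches any input).
NETWORK = {
    "start": {"prompt": "Departure city?", "edges": {"*": "ask_dest"}},
    "ask_dest": {"prompt": "Destination city?", "edges": {"*": "confirm"}},
    "confirm": {"prompt": "Confirm booking?", "edges": {"yes": "done", "no": "start"}},
    "done": {"prompt": "Booked.", "edges": {}},
}


def step(state, user_input):
    # Follow an exact edge if one exists, else the wildcard edge, else stay put.
    edges = NETWORK[state]["edges"]
    return edges.get(user_input, edges.get("*", state))


def run(inputs, state="start"):
    # Collect the prompts the system issues while walking the network.
    prompts = [NETWORK[state]["prompt"]]
    for text in inputs:
        state = step(state, text)
        prompts.append(NETWORK[state]["prompt"])
    return state, prompts
```

Porting this sketch to another domain only requires replacing `NETWORK`, but the user can never leave the predefined paths, which matches the system-initiated character noted above.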
Another approach is to develop a dialogue manager based on a form-based model. For example, Goddeau et al. disclosed a method that designs a form for needed information in “A Form-Based Dialogue Manager for Spoken Language Applications”, Proceedings of ICSLP'96, pp. 701-704, Philadelphia, USA, 1996. The user input is used to fill in corresponding fields of the form. Once a field in the form is filled in, the system responds with a corresponding action, for example, asking the user for more information to fill in other fields. The user may fill in the form following the system's prompts or in any order. This type of mixed-initiative dialogue is limited to a goal-specific spoken dialogue system, such as one that retrieves information of interest to the user from a large database.
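The form-filling behavior can be illustrated with a small sketch. The field names and the action strings below are assumptions made for illustration and are not taken from Goddeau et al.

```python
from typing import Dict, Optional

Form = Dict[str, Optional[str]]


def fill(form: Form, user_input: Dict[str, str]) -> Form:
    # Copy the form and fill only the fields it declares; the user may
    # supply fields in any order, independent of the last system prompt.
    updated = dict(form)
    for field, value in user_input.items():
        if field in updated:
            updated[field] = value
    return updated


def next_action(form: Form) -> str:
    # Request the first empty field; once the form is complete, query the database.
    for field, value in form.items():
        if value is None:
            return f"request:{field}"
    return "query_database"
```

A short usage example under the same assumptions:

```python
form = {"origin": None, "destination": None, "date": None}
form = fill(form, {"destination": "Tainan"})   # user volunteers a later field first
print(next_action(form))                       # request:origin
```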
An alternative approach is to develop a dialogue manager based on a tree-structured model. The following are some examples of prior art on the subject:
U.S. Pat. No. 5,694,558 granted to Sparks et al., entitled “Method and System for Interactive Object-Oriented Dialogue Management”.
Camineo-Gil et al., “Data-Driven Discourse Modeling for Semantic Interpretation”, Proceedings of ICASSP'96, pp. 401-404, Atlanta, Ga., 1996.
Masahiro et al., “A Cooperative Man-Machine Dialogue Model for Problem Solving”, Proceedings of ICSLP'94, pp. 883-886, Yokohama, JP, 1994.
In the above art, the dialogue management technology is based on a task-oriented dialogue model. In this approach, the system task includes a set of subtasks, and below each subtask is a set of smaller tasks. The dialogue plan is implemented as a tree structure. Controlling the dialogue is like searching the nodes of the tree, and control cannot be switched among subtasks at the same level. Therefore, the dialogue plan tends to be inflexible.
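One simple reading of such a tree-structured plan is a depth-first walk over nested tasks. The sketch below assumes a hypothetical trip-booking tree; the task names are illustrative only.

```python
# Hypothetical task tree: each node is (name, children).
TASK_TREE = ("trip", [
    ("book_flight", [("ask_date", []), ("ask_seat", [])]),
    ("book_hotel", [("ask_nights", [])]),
])


def plan(node):
    # Depth-first search: a subtask's smaller tasks are all visited
    # before the walk moves on to the next subtask at the same level.
    name, children = node
    if not children:
        return [name]
    steps = []
    for child in children:
        steps.extend(plan(child))
    return steps
```

Because the visiting order is fixed by the tree, the hotel subtask in this sketch is reached only after the flight subtask is finished, which illustrates the inflexibility described above.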
Another approach is to develop a dialogue manager based on a table-driven model. For example, Stephanie Seneff discloses a method that provides a set of variables representing dialogue states in “Discourse and Dialogue Modeling in the GALAXY Systems”, Seminar of Spoken Dialogue System and Discourse Analysis, pp. 12-24, Taipei, Taiwan, ROC, 1997. The expressions performed on the variables trigger the system actions based on default rules such as Boolean operations, arithmetic operations or string comparisons. A table consisting of the well-defined variables, rules and system actions represents the complete flow over the dialogue. The values of the variables may vary during the dialogue execution. This will cause different rules to trigger the corresponding actions. Therefore, it is a dialogue system of a mixed-initiative type.
In this method, the complete flow of the dialogue system is described in a single table, which cannot provide a structural description of the dialogue system. When the dialogue system handles multiple subjects, all the variables for those subjects must be specified in the same table. In practice, variables used by some subjects may not be used by others. In such a situation, a single table becomes too complex and is thus hard to maintain.
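A table of variables, rules, and actions can be sketched as a list of condition and action pairs evaluated against the current variable values. The variable names, the rules, and the first-match policy below are all assumptions for illustration; they are not the GALAXY rule format.

```python
# Hypothetical flat rule table: (condition over the variables, action name).
# Conditions use Boolean tests and arithmetic comparisons on the variables.
RULES = [
    (lambda v: v.get("city") is None, "ask_city"),
    (lambda v: v.get("date") is None, "ask_date"),
    (lambda v: v.get("num_results", 0) > 3, "ask_narrowing"),
    (lambda v: True, "report_results"),
]


def fired_action(variables):
    # Return the action of the first rule whose condition holds.
    for condition, action in RULES:
        if condition(variables):
            return action
    return "no_op"
```

As the variable values change during the dialogue, a different row fires. Every subject handled by the system would have to add its variables and rows to this one list, which is the maintenance problem noted above.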
A similar approach, which also aims to make the dialogue manager portable, is based on a task description table (TDT) 106, as shown in FIG. 1.
Another approach is to develop a dialogue manager based on a dialogue-state or a stack model. For example, Mark-Jan Nederhof et al. presented “Grammatical Analysis in the OVIS Spoken Dialogue System”, based on dialogue states, in Proceedings Workshop sponsored by the Association for Computational Linguistics, pp. 66-73, Madrid, Spain, 1997. Emiel Krahmer et al. presented “How to Obey the 7 Commands for Spoken Dialogue?”, based on stacks, in Proceedings Workshop sponsored by the Association for Computational Linguistics, pp. 82-89, Madrid, Spain, 1997. Dialogue states or stacks are used to record the whole dialogue flow and thereby provide the information needed to control the flow of the dialogue.
In addition, Rajeev Agarwal discloses a technology that divides a dialogue manager into two layers in “Towards a PURE Spoken Dialogue System for Information Access”, Proceedings Workshop sponsored by the Association for Computational Linguistics, pp. 90-97, Madrid, Spain, 1997. In order to be ported to different domains for a spoken dialogue system, one layer is used to process domain-dependent dialogue states and the other layer is used to process domain-independent dialogue states.
The present invention has been made to overcome the above mentioned drawbacks of a conventional dialogue manager. The primary object of the invention is to provide a domain transparent dialogue manager that has a standard control mechanism. Accordingly, the portable dialogue management system of the invention comprises a dialogue manager and a hierarchical task description table (HTDT). The dialogue manager manages dialogue states of a dialogue system, selects appropriate dialogue states, and executes the response actions according to the selected dialogue states. The hierarchical task description table stores the dialogue states and defines dialogue strategy of the dialogue system.
According to the invention, the domain-dependent factors related to application domains are extracted out of the dialogue manager to form an external knowledge base, so that the control mechanism can be standardized. The dialogue manager controls the dialogue flow according to semantic input of a user and the instructions provided by the external knowledge base to generate semantic output in response. When the application domain changes, the external knowledge base is replaced instead of changing the dialogue manager. This solves the portability problem and lowers the cost of porting to a different domain.
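The separation between a fixed control mechanism and a replaceable knowledge base can be sketched as follows. This is not the HTDT or the control mechanism of the invention, whose details this section does not give; the class, the dictionary layout, and the slot names are hypothetical.

```python
class DialogueManager:
    """Domain-independent control loop; all domain knowledge comes from `knowledge`."""

    def __init__(self, knowledge):
        self.knowledge = knowledge  # external knowledge base (hypothetical layout)
        self.filled = {}            # slot values collected so far

    def respond(self, semantic_input):
        # Keep only the slots that the knowledge base declares.
        for slot, value in semantic_input.items():
            if slot in self.knowledge["slots"]:
                self.filled[slot] = value
        # Request the first missing slot, else emit the domain's final action.
        for slot in self.knowledge["slots"]:
            if slot not in self.filled:
                return {"act": "request", "slot": slot}
        return {"act": self.knowledge["final_act"], "args": dict(self.filled)}


# Two hypothetical knowledge bases; porting means swapping one for the other.
WEATHER = {"slots": ["city", "day"], "final_act": "report_weather"}
RAILROAD = {"slots": ["origin", "destination"], "final_act": "report_trains"}
```

In this sketch, `DialogueManager(WEATHER)` and `DialogueManager(RAILROAD)` share every line of control code, so changing the domain touches only the data passed in.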
Another object of the invention is to provide a hierarchical task description table that allows the external knowledge base to be easily designed and maintained. The HTDT can describe the multi-goal dialogue flow and the mixed-initiative type of dialogue embodied in a dialogue system. A dialogue flow designed in this way is more modular and sharable, and is easier to maintain and update. Furthermore, because the external knowledge base allows the system developer to predict the dialogue flow, it gives system development a great deal of flexibility.
In one embodiment of the dialogue manager of this invention, a public transportation service system serves as a multi-goal dialogue system. It provides ticket order services for aircraft, railroad and bus travel. Each service is an independent subtask. Each subtask is divided into three smaller tasks: ticket order, timetable query and fare query.
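The task hierarchy of this embodiment can be written down as nested data. The sketch below only mirrors the three services and three smaller tasks named above; the identifiers and the dictionary layout are assumptions and do not represent the actual HTDT format.

```python
# Services and smaller tasks named in the embodiment; identifiers are illustrative.
SERVICES = ["aircraft", "railroad", "bus"]
SMALLER_TASKS = ["ticket_order", "timetable_query", "fare_query"]

# Each service is an independent subtask with its own copy of the smaller tasks.
TASK_HIERARCHY = {service: list(SMALLER_TASKS) for service in SERVICES}


def leaf_tasks(hierarchy):
    # Flatten the hierarchy into "service/task" goal names.
    return [f"{service}/{task}" for service, tasks in hierarchy.items() for task in tasks]
```

Flattening this hierarchy yields nine distinct goals, which gives a sense of how a single flat table would grow as services are added.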
The foregoing and other objects, features, aspects and advantages of the present invention will become better understood from a careful reading of a detailed description provided herein below with appropriate reference to the accompanying drawings.