Dialogue processing systems often utilize models of naturally-occurring collaborative dialogues to facilitate interaction between humans and computers. An important feature of such collaborative dialogue models is the initiative, which refers generally to a type of lead role which may be undertaken by a dialogue participant. Participants in a collaborative dialogue will also be referred to herein as agents. Initiative in a typical collaborative dialogue usually shifts among participants in a manner which is signaled by features such as linguistic cues and prosodic cues. In order for an automated dialogue processing system to interact with users in a natural and coherent manner, it must recognize the cues for initiative shifts from the users and provide appropriate cues in its responses to user utterances. Such dialogue processing systems are expected to come into increasingly widespread use in a variety of speech processing applications, including computer interfaces, automated call handlers, automatic teller machines (ATMs), reservation systems, interactive on-line services, and any other application involving human-machine interaction which can be characterized as a collaborative dialogue.
Conventional dialogue processing techniques which track initiative shifts in a collaborative dialogue are generally focused on tracking a single thread of control, typically the conversational lead, among the participants. Such techniques are described in, for example, S. Whittaker and P. Stenton, "Cues and Control in Expert-Client Dialogues," Proc. of the 26th Annual Meeting of the Association for Computational Linguistics, pp. 123-130, 1988; M. Walker and S. Whittaker, "Mixed-Initiative in Dialogue: An Investigation into Discourse Segmentation," Proc. of the 28th Annual Meeting of the Association for Computational Linguistics, pp. 70-78, 1990; H. Kitano and C. Van Ess-Dykema, "Toward a Plan-Based Understanding Model for Mixed-Initiative Dialogues," Proc. of the 29th Annual Meeting of the Association for Computational Linguistics, pp. 25-32, 1991; R. W. Smith and D. R. Hipp, "Spoken Natural Language Dialog Systems--A Practical Approach," Oxford University Press, 1994; and C. I. Guinn, "Mechanisms for Mixed-Initiative Human-Computer Collaborative Discourse," Proc. of the 34th Annual Meeting of the Association for Computational Linguistics, pp. 278-285, 1996, all of which are incorporated by reference herein.
A significant problem with these and other conventional initiative-tracking techniques is that merely maintaining the conversational lead is often insufficient for modeling the complex behavior commonly found in naturally-occurring collaborative dialogues. Consider the following example of a collaborative dialogue in which a number of alternative responses (3a)-(3c) may be given by an advisor A in response to a question from a student S:
(1) S: I want to take NLP to satisfy my seminar course requirement.
(2) S: Who is teaching NLP?
(3a) A: Dr. Smith is teaching NLP.
(3b) A: You can't take NLP because you haven't taken AI, which is a prerequisite for NLP.
(3c) A: You can't take NLP because you haven't taken AI, which is a prerequisite for NLP. You should take distributed programming to satisfy your requirement, and sign up as a listener for NLP.
In response (3a), the advisor A directly responds to S's question, and the conversational lead therefore remains with S. In responses (3b) and (3c), A takes the lead by initiating a subdialogue to correct S's invalid proposal. However, the above-noted conventional collaborative dialogue models maintain only the conversational lead, and are therefore unable to distinguish between the different types of initiatives in responses (3b) and (3c). For example, in response (3c), A actively participates in the planning process by explicitly proposing alternative actions, whereas in (3b), A merely conveys the invalidity of S's proposal. This example illustrates that it is desirable to distinguish between task initiative, which tracks the lead in the development of a plan, and dialogue initiative, which tracks the lead in determining the current focus of the discourse. The distinction allows A's behavior to be explained from a response generation point of view: in (3b), A responds to S's proposal by merely taking over the dialogue initiative, i.e., informing S of the invalidity of the proposal, while in (3c), A responds by taking over both the task and dialogue initiatives, i.e., informing S of the invalidity and suggesting a possible remedy. A dialogue processing system configured using conventional initiative tracking is unable to recognize the distinction between task and dialogue initiative, and is therefore also unable to determine under what circumstances to generate the different types of responses represented by (3b) and (3c) above.
In general, a given agent is said to have the task initiative if that agent is directing how any agent's task should be accomplished, i.e., if the given agent's utterances directly propose actions that an agent should perform. The utterances may propose domain actions that directly contribute to achieving an agent's goal, such as "Let's send engine E2 to Corning." On the other hand, the given agent may propose problem-solving actions that do not contribute directly to an agent's goal, but instead to how an agent would go about achieving this goal, such as "Let's look at the first problem first." An agent is said to have the dialogue initiative if that agent takes the conversational lead in order to establish mutual beliefs between that agent and another agent, such as mutual beliefs about a piece of domain knowledge or about the validity of a proposal. For instance, in responding to a proposal from an agent proposing to send a boxcar to Corning via Dansville, another agent may take over the dialogue initiative, but not the task initiative, by saying "We can't go by Dansville because we've got Engine 1 going on that track." Thus, when an agent takes over the task initiative, that agent also takes over the dialogue initiative, since a proposal of actions can be viewed as an attempt to establish the mutual belief that a set of actions be adopted. On the other hand, an agent may take over the dialogue initiative but not the task initiative, as in response (3b) above. As noted previously, conventional techniques for tracking initiative in collaborative dialogues are unable to accommodate this important distinction.
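By way of illustration only, the relationship between the two initiatives described above may be sketched as a small data structure that records a separate holder for each initiative. The sketch below is not drawn from any of the cited references; the names Agent, InitiativeState and respond, and the two hand-labeled flags proposes_actions and leads_discourse, are hypothetical and are introduced solely for this example. It does not attempt to recognize linguistic or prosodic cues, and assumes instead that each utterance has already been labeled.

```python
from dataclasses import dataclass
from enum import Enum


class Agent(Enum):
    """Participants in the example dialogue (hypothetical labels)."""
    STUDENT = "S"
    ADVISOR = "A"


@dataclass(frozen=True)
class InitiativeState:
    """Tracks the two initiatives separately rather than a single lead."""
    task_holder: Agent
    dialogue_holder: Agent


def respond(state, speaker, proposes_actions, leads_discourse):
    """Return the initiative holders after the speaker's turn.

    Assumption: the two flags are supplied by hand; a real system would
    have to infer them from cues in the utterance.
    """
    if proposes_actions:
        # Proposing actions is itself an attempt to establish a mutual
        # belief, so taking the task initiative also takes the dialogue
        # initiative.
        return InitiativeState(task_holder=speaker, dialogue_holder=speaker)
    if leads_discourse:
        # The speaker redirects the discourse without proposing actions,
        # so only the dialogue initiative changes hands.
        return InitiativeState(task_holder=state.task_holder,
                               dialogue_holder=speaker)
    # A direct answer leaves both initiatives where they were.
    return state


# After utterances (1) and (2), the student S holds both initiatives.
start = InitiativeState(Agent.STUDENT, Agent.STUDENT)

after_3a = respond(start, Agent.ADVISOR, proposes_actions=False, leads_discourse=False)
after_3b = respond(start, Agent.ADVISOR, proposes_actions=False, leads_discourse=True)
after_3c = respond(start, Agent.ADVISOR, proposes_actions=True, leads_discourse=True)
```

Under these assumptions, response (3a) leaves both initiatives with S, response (3b) moves only the dialogue initiative to A, and response (3c) moves both initiatives to A. A tracker that stores only a single conversational lead would record the same outcome for (3b) and (3c).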
It is therefore apparent that a need exists for dialogue processing techniques which are capable of differentiating and tracking the above-described task and dialogue initiatives in an efficient manner suitable for use in a variety of practical applications.