Natural language conversations often involve multiple exchanges when discussing a single topic, thereby creating a dialogue. Dialogues involve setting a context, asking and verifying, and managing follow-ups and continuations. Human-computer interaction is no different. Background information summarizing some of the work done to date on the topic of dialoguing in natural language processing can be found in Barbara Grosz, “Discourse and Dialogue,” Chap. 6 of R. Cole, J. Mariani, H. Uszkoreit, G. B. Varile, A. Zaenen, A. Zampolli, V. Zue, eds., “Survey of the State of the Art in Human Language Technology,” Cambridge University Press and Giardini (1997), incorporated herein by reference.
U.S. Pat. No. 6,144,989, incorporated by reference herein, describes an adaptive agent oriented software architecture (AAOSA), in which an agent network is developed for the purpose of interpreting user input in a distributed manner as commands and inquiries for a back end application, such as an audiovisual system or a financial reporting system. An AAOSA agent network is a network of agents, each (or most) of which contain one or more “interpretation policies” that describe the agent's function in the distributed parsing operation. An interpretation policy includes, among other things, a policy condition and a policy action. When an agent receives a message from another agent to attempt to interpret an input string, it compares the input string to each of the agent's policy conditions in sequence. If a condition does apply to the input string, or to part of the input string, then the policy makes a “claim” on the applicable portion of the input string, and returns the claim to the agent that requested the interpretation. A claim identifies (among other things) the agent and policy which is making the claim, the portion of the input string to which the claim applies (called the claim “focus”), the priority number of the policy, and also a confidence level which indicates how well the input matches the policy condition. The priority and confidence levels, and the focus, all can be used subsequently by upchain agents for comparing all claims made by the agent, so as to permit the agent to select a “best” one among competing claims.
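By way of illustration only, the claim contents described above can be sketched in Python as follows. The field names are hypothetical and are not taken from the AAOSA implementation, and the comparison order (priority first, then confidence, then size of the focus) is an assumption, since the manner of combining these criteria is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Claim:
    """Illustrative claim record; field names are assumptions, not the AAOSA API."""
    agent: str               # agent making the claim
    policy: str              # policy within that agent that made the claim
    focus: Tuple[int, int]   # claimed token span as (start, end), end exclusive
    priority: int            # priority number of the claiming policy
    confidence: float        # how well the input matched the policy condition

    def focus_size(self) -> int:
        return self.focus[1] - self.focus[0]


def select_best(claims: List[Claim]) -> Optional[Claim]:
    """Select one claim among competing claims.

    Assumed ordering: higher priority, then higher confidence, then larger focus.
    """
    if not claims:
        return None
    return max(claims, key=lambda c: (c.priority, c.confidence, c.focus_size()))


# Three competing claims on the input "contact john smith" (hypothetical values).
competing = [
    Claim("First Name", "p1", (1, 2), priority=0, confidence=0.9),
    Claim("Name", "p2", (1, 3), priority=0, confidence=0.9),
    Claim("Company", "p1", (1, 2), priority=0, confidence=0.4),
]
best = select_best(competing)
```

In this sketch the Name agent's claim is selected because it ties on priority and confidence with the First Name agent's claim but covers a larger focus.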
Policy conditions in AAOSA are typically written as expressions made up from operators and operands. The various operators include unary operators such as <exists>, <exact>, <substring>, <accent>, <accent-substring>, REPEAT and PLUS. They also include binary operators such as OR, AND, ORDERED, ADJACENT and AMBIGUITY. The operands on which an operator can act include tokens (words, strings, numbers, symbols, delimiters), text files, databases, and claims made by other policies. If a first policy condition (the “referencing policy condition”) refers to a second policy (the “referenced policy”) previously evaluated in the same agent, then any claim made by the referenced policy can be figured into the evaluation of the referencing policy condition in the manner specified by the operators. If a policy condition refers to another agent (the “referenced agent”) downchain of the current agent (the “referencing agent”), then the claim or claims returned by the referenced downchain agent are figured into the evaluation of the referencing policy condition in the manner specified by the operators. Note that a policy condition that references a downchain agent cannot be completely resolved until the input string is passed to that other agent for comparison against its own policy conditions. In one embodiment, the referencing agent passes the input string to each downchain agent only upon encountering that agent's name while evaluating a policy condition. In a typical embodiment, however, the referencing agent passes the input string to all downchain agents mentioned in any policy condition in the referencing agent, before the referencing agent begins evaluating even its first policy condition.
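The composition of policy conditions from operators and operands can be sketched in Python as follows. Only the operator names OR, AND, ORDERED and ADJACENT come from the description above; their semantics are not defined here, so the behavior below is a simplified assumption, and the helper `token` and the types `Focus` and `Condition` are hypothetical. Each literal token operand claims only its first occurrence in the input.

```python
from typing import Callable, FrozenSet, List, Optional

Focus = FrozenSet[int]                               # indices of claimed tokens
Condition = Callable[[List[str]], Optional[Focus]]   # returns a focus, or None if no match


def token(word: str) -> Condition:
    """Operand: match a literal token (case-insensitive), claiming its first occurrence."""
    def check(tokens: List[str]) -> Optional[Focus]:
        for i, t in enumerate(tokens):
            if t.lower() == word.lower():
                return frozenset({i})
        return None
    return check


def OR(a: Condition, b: Condition) -> Condition:
    """Assumed semantics: claim whichever operand matches, preferring the first."""
    def check(tokens: List[str]) -> Optional[Focus]:
        fa = a(tokens)
        return fa if fa is not None else b(tokens)
    return check


def AND(a: Condition, b: Condition) -> Condition:
    """Assumed semantics: both operands must match; the focus is their union."""
    def check(tokens: List[str]) -> Optional[Focus]:
        fa, fb = a(tokens), b(tokens)
        return fa | fb if fa is not None and fb is not None else None
    return check


def ORDERED(a: Condition, b: Condition) -> Condition:
    """Assumed semantics: both match, and all of a's focus precedes b's focus."""
    def check(tokens: List[str]) -> Optional[Focus]:
        fa, fb = a(tokens), b(tokens)
        if fa is None or fb is None or max(fa) >= min(fb):
            return None
        return fa | fb
    return check


def ADJACENT(a: Condition, b: Condition) -> Condition:
    """Assumed semantics: both match, and b's focus starts right after a's focus ends."""
    def check(tokens: List[str]) -> Optional[Focus]:
        fa, fb = a(tokens), b(tokens)
        if fa is None or fb is None or min(fb) != max(fa) + 1:
            return None
        return fa | fb
    return check


tokens = "contact john smith".split()
full_name = ADJACENT(token("john"), token("smith"))(tokens)   # matches tokens 1 and 2
reversed_name = ORDERED(token("smith"), token("john"))(tokens)  # no match: wrong order
```

A fuller treatment would let operands be claims returned by other policies or by downchain agents, as described above, rather than literal tokens only.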
Thus it can be seen that in a typical AAOSA network, interpretation of the user's intent takes place in an agent network in a distributed manner. Each of the agents in the agent network can be thought of as having a view of its own domain of responsibility, as defined by its interpretation policies. Typically the application domain is organized by the designer into a hierarchy of semantic sub-domains, and individual agents are defined for each node in the semantic hierarchy. The network is also typically organized so as to include a Top agent, responsible for receiving input and initiating queries into the network. Agents representing the functionality of the system (the agents constructing their actuation sub-strings without reference to further agents) typically are the lowest order nodes (leaf agents) of the network.
A typical AAOSA network operates in two main phases: the interpretation phase (also called the claiming phase) and the delegation phase (also called the actuation phase). In the interpretation phase, an initiator agent (such as the Top agent) receives the input token sequence from a user Interaction agent and, by following the Top agent's policy conditions, queries its downchain agents whether the queried agent considers the input token sequence, or part of it, to be in its domain of responsibility. Each queried agent recursively determines whether it has an interpretation policy of its own that applies to the input token sequence, if necessary further querying its own further downchain agents in order to evaluate its policy conditions. The further agents eventually respond to such further queries, thereby allowing the first-queried agents to respond to the initiator agent. The recursive invocation of this procedure ultimately determines a path, or a set of paths, through the network from the initiator agent to one or more leaf agents. The path is represented by the claim(s) ultimately made by the initiator agent. After the appropriate paths through the network are determined, in the delegation phase, delegation messages are then transmitted down each determined path, in accordance with the action parts of winning policies, with each agent along the way taking any local action thereon and filling in with further action taken by the agents further down in the path. The local action involves building up segments of the actuation object, with each agent providing the word(s) or token(s) that its policies now know, by virtue of being in the delegation path, represent a proper interpretation of at least part of the user's intent. The resulting actuation object built up by the selected agents in the network is returned to the initiator agent as the output of the network. 
The initiator agent then typically forwards the actuation object to an Actuation agent, which evaluates the fields and field designators therein to issue a command or query to the back-end application and returns any response back to the user via the Interaction agent. In this way the intent of the user, as expressed in the input token string and interpreted by the agent network, is effected.
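The two phases described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the AAOSA implementation: the `Agent` class, its `keywords` field, the dictionary used as a claim, and the flat dictionary used as the actuation object are all hypothetical, and policy conditions are reduced to simple keyword matching in leaf agents.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    name: str
    keywords: List[str] = field(default_factory=list)      # tokens a leaf agent recognizes
    downchain: List["Agent"] = field(default_factory=list)  # agents queried by this agent

    def interpret(self, tokens: List[str]) -> Optional[dict]:
        """Interpretation (claiming) phase: query downchain agents recursively."""
        own = [t for t in tokens if t.lower() in self.keywords]
        sub = [c for c in (a.interpret(tokens) for a in self.downchain) if c]
        if not own and not sub:
            return None   # nothing in this agent's domain of responsibility
        return {"agent": self, "tokens": own, "sub": sub}


def delegate(claim: dict) -> dict:
    """Delegation (actuation) phase: follow the claimed path, building the actuation."""
    agent = claim["agent"]
    actuation = {}
    if claim["tokens"]:
        # Local action: contribute the tokens this agent claimed.
        actuation[agent.name] = " ".join(claim["tokens"])
    for sub in claim["sub"]:
        # Fill in with the action taken by agents further down the path.
        actuation.update(delegate(sub))
    return actuation


# Hypothetical contact-manager network: top -> name -> (first_name, last_name).
first = Agent("first_name", keywords=["john"])
last = Agent("last_name", keywords=["smith"])
name = Agent("name", downchain=[first, last])
top = Agent("top", downchain=[name])

claim = top.interpret("contact john smith".split())
actuation = delegate(claim)
```

Here the claim returned to the Top agent records the path through the Name agent to both leaf agents, and the delegation pass assembles an actuation containing the first and last name.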
In order to provide a natural interaction it is desirable for natural language systems to support dialogues. In the past, dialoguing has been supported through the interpretation network of an AAOSA system as a hard-coded part of the interpretation phase of the network. In particular, the Interaction agent, after an input token sequence was interpreted by the agent network, would retain a copy of the winning claim. When new input arrived, if context was to be maintained, the Interaction agent would send the winning claim from the prior input into the interpretation network together with the new input. Then in the interpretation agents, whenever a policy made a claim, the agent class method would also retrieve from the previous winning claim the claims that were made by the policy in response to the previous input, and would re-make the same claims in addition to any new claims it could make on the new input. The claims re-made from the previous winning claim were treated as being of lower quality, so as to be superseded generally by any claims newly made on the new input. Consider, for example, a system for a contact manager in which the agent network includes a First Name agent and a Last Name agent, both downchain of a Name agent. Dialogs such as the following were supported:
user: “Contact John”
system: First Name agent claims ‘John’.
user: “Last name Smith”
system: Last Name agent claims ‘Smith’.
        First Name agent repeats its prior claim on ‘John’.
        Name agent combines the two claims to claim ‘John Smith’.
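The repetition of prior claims in the dialog above can be sketched in Python as follows. This is an illustration under stated assumptions: the vocabulary table `VOCAB`, the functions `interpret` and `name_agent`, and the quality values 1.0 and 0.5 are hypothetical; the description above states only that re-made claims were treated as being of lower quality than newly made claims.

```python
from typing import Dict, List, Optional, Tuple

ClaimMap = Dict[str, Tuple[str, float]]   # agent name -> (claimed value, quality)

# Hypothetical leaf vocabularies for the contact manager example.
VOCAB = {
    "first_name": {"john", "mary"},
    "last_name": {"smith", "jones"},
}


def interpret(tokens: List[str], prior: Optional[ClaimMap] = None) -> ClaimMap:
    """Make new claims on the input, re-making prior claims at lower quality."""
    prior = prior or {}
    claims: ClaimMap = {}
    for agent, words in VOCAB.items():
        new = [t for t in tokens if t.lower() in words]
        if new:
            claims[agent] = (new[0], 1.0)             # new claim on the new input
        elif agent in prior:
            claims[agent] = (prior[agent][0], 0.5)    # repeated prior claim, lower quality
    return claims


def name_agent(claims: ClaimMap) -> Optional[str]:
    """Combine the first-name and last-name claims, as the Name agent does above."""
    parts = [claims[a][0] for a in ("first_name", "last_name") if a in claims]
    return " ".join(parts) or None


first_turn = interpret("Contact John".split())
second_turn = interpret("Last name Smith".split(), prior=first_turn)
combined = name_agent(second_turn)
```

In the second turn the First Name agent has no new claim, so its prior claim on ‘John’ is re-made at the lower quality and combined with the new claim on ‘Smith’. A later input containing a new first name would supersede the repeated claim.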
These approaches were powerful, but they also had a number of limitations. First, the repetition of prior claims was a hardcoded feature of the methods underlying the agent network. The designer of a given network had no control over the feature through the agent policy definitions. Not only did this limit flexibility, but it also sometimes created undesirable behavior which had to be prevented through the use of additional policy conditions. Policies sometimes had to be written to re-interpret claims, including determining whether certain claims had been based on still-earlier input. In addition, in many systems part of the decision about what to include in the actuation object was made during the delegation phase, by the action part of policies, rather than in the interpretation phase. In that case the winning claim made in response to the prior input did not necessarily accurately reflect the actuation that actually resulted from the prior input. In some systems the Actuation agent, too, could modify the actuation in a manner that would not be reflected in the winning claim. Further, even though the entire prior winning claim was sent into the interpretation network with each new input token string, each agent made use of only its own prior claims. There was no easy way for an agent to refer to the prior claims made by other agents. Still further, the designer of the Actuation agent in some systems sometimes had to examine the winning claim object instead of only the actuation object representing the interpretation of the network. But since the winning claim object was a much more complex structure than the actuation object, this created a much greater knowledge burden on designers of Actuation agents than was desirable.
These limitations in the earlier dialoguing mechanisms made it difficult to support such features as replacing parameters of prior input, expressing dissatisfaction with the results of prior input, or expressly or implicitly discontinuing a dialog. A new approach to dialoguing is urgently required.