1. Field of the Invention
The invention relates to the field of dialogue-based systems and, more particularly, to dialogue-based systems that incorporate natural language understanding technology.
2. Description of the Related Art
Dialogue systems serve as interfaces through which users can query an information processing system for information or direct an information processing system to perform an action. One variety of dialogue system is a directed dialogue system. A directed dialogue system controls the flow of dialogue with a user through a structured set of menu choices. The user is prompted for a single item of information at each point within the menu hierarchy. The user responds to each prompt, providing responses in serial fashion until the dialogue system obtains the information needed to respond to the user's query or perform the user-specified action. Within directed dialogue systems, the user must proceed through the menu hierarchy, providing requested information in the order in which the dialogue system asks, without deviation.
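The fixed prompt order of a directed dialogue system can be illustrated with a minimal Python sketch. The menu contents, slot names, and the function `run_directed_dialogue` are hypothetical and chosen only for illustration. Scripted answers stand in for live user input.

```python
# Hypothetical menu for a fund transfer; slot names and prompts are illustrative.
TRANSFER_MENU = [
    ("amount", "How much would you like to transfer?"),
    ("source_fund", "From which fund would you like to transfer?"),
    ("target_fund", "To which fund would you like to transfer?"),
]


def run_directed_dialogue(menu, answers):
    """Ask each prompt in fixed order, consuming exactly one answer per prompt."""
    collected = {}
    transcript = []
    replies = iter(answers)
    for slot, prompt in menu:
        reply = next(replies)  # the user cannot skip ahead or answer out of order
        transcript.append((prompt, reply))
        collected[slot] = reply
    return collected, transcript


collected, transcript = run_directed_dialogue(
    TRANSFER_MENU, ["$4,000.00", "Fixed Income", "Growth"]
)
```

In this sketch, each item of information costs one turn, so the three-slot transfer always takes three turns.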
Another type of dialogue system, which can be referred to as a natural language (or free-form) mixed-initiative dialogue system, a conversational dialogue system, or a conversational natural language system, allows the user to take control of the dialogue at any point if desired. Otherwise, the dialogue flows in a pre-defined manner with the system prompting the user for one item of information at a time. Control of the dialogue can alternate between the user and the mixed-initiative system, hence the name. A mixed-initiative system also allows a user to phrase a request in a natural manner, providing as much or as little information as the user wishes in the same way the user would interact with a human operator. As a result, a transaction that would take several turns in a directed dialogue system can potentially be completed in a single turn.
A mixed-initiative system typically starts a dialogue by asking the user an open-ended question such as “How may I help you?” This starting point of the dialogue is referred to as the Main Menu state. When in the Main Menu state, the user can take control of the dialogue and issue a request for anything within the scope of the information processing system. The following are examples of such user-initiated requests in the context of the Main Menu state:
System: How may I help you?
User: I'd like to transfer $4000.00 to the Growth fund.

System: How may I help you?
User: What is my balance in the Growth fund?
When processing a user-initiated request, the mixed-initiative system needs to determine the task or action that the user has requested, in this case transferring money or determining the balance of an account. The mixed-initiative system must also identify any tokens of information that the user has provided with the request, such as the amount to be transferred or the account or fund for which the balance is to be determined. The mixed-initiative system performs the action if all tokens required for the action have been determined.
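The completeness check described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular system. The `REQUIRED_TOKENS` table, the `Request` class, and the token names are hypothetical, and the action and tokens are assumed to have already been extracted from the user's utterance.

```python
from dataclasses import dataclass, field

# Hypothetical table of the tokens each action needs; names are illustrative only.
REQUIRED_TOKENS = {
    "transfer": ("amount", "source_fund", "target_fund"),
    "balance": ("fund",),
}


@dataclass
class Request:
    action: str
    tokens: dict = field(default_factory=dict)


def missing_tokens(request):
    """Return the required tokens not yet supplied, in prompt order."""
    required = REQUIRED_TOKENS[request.action]
    return [name for name in required if name not in request.tokens]


def next_step(request):
    """Perform the action when it is complete; otherwise gather the next token."""
    missing = missing_tokens(request)
    if not missing:
        return ("perform", request.action)
    return ("gather", missing[0])


# "I'd like to transfer $4000.00 to the Growth fund." (source fund not given)
transfer = Request("transfer", {"amount": 4000.00, "target_fund": "Growth"})
# "What is my balance in the Growth fund?" (all required tokens given)
balance = Request("balance", {"fund": "Growth"})
```

Under these assumptions, `next_step(balance)` indicates that the balance action can be performed at once, while `next_step(transfer)` indicates that the source fund must still be obtained.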
If further tokens of information are necessary before the action can be performed, for example the source from which money will be transferred, the mixed-initiative system enters a token gathering state. In the token gathering state, the mixed-initiative system takes control of the dialogue by asking for any missing or ambiguous tokens of information through a series of directed, system-initiated prompts. The system can prompt for one or more tokens of information at a time. When in the token gathering state, the user often responds directly to the system-initiated prompts, providing the token(s) of information that were requested. The following are examples of such system-initiated or context-dependent responses in the token gathering state:
System: From which fund would you like to transfer $4,000.00?
User: From the Fixed Income fund.
System: Confirming your request to transfer $4,000.00 from the Fixed Income fund to the Growth fund. Is this correct?
User: Yes.

System: From which airport in New York do you want to pick up the car?
User: LaGuardia.
System: On what date will you return the car?
User: A week from next Tuesday.
Within a mixed-initiative application, the user may choose to take control of the dialogue at any point in time. When in the token gathering state, the user may choose not to respond directly to a system-initiated prompt, and instead, respond out of context, issuing a new user-initiated request for anything within the scope of the information processing system. The following dialogue is an example of such a situation.
System: From which fund would you like to transfer $4,000.00?
User: How much do I have in the Fixed Income fund?
System: You have $6,700.00 in the Fixed Income fund. Continuing with your fund transfer, from which fund would you like to transfer $4,000.00?
User: What's my balance in the Large Cap fund?
System: You have $3,700.00 in the Large Cap fund. Continuing with your fund transfer, from which fund would you like to transfer $4,000.00?
User: From the Fixed Income fund.
System: Confirming your request to transfer $4,000.00 from the Fixed Income fund to the Growth fund. Is this correct?
User: Yes.

Notably, despite the two requests for balance information being asked in the middle of the fund transfer action, the requests are treated as valid requests that could have been asked in the Main Menu state. Therefore, Main Menu-like user input received in the token gathering state is another source of user-initiated requests. The following is another example of a user-initiated request received in the token gathering state:
System: From which city would you like to depart?
User: I would like to speak to an operator.
System: Please hold while your call is transferred.

In summary, a mixed-initiative system needs to interpret user-initiated requests in the Main Menu state, and to interpret either token input (system-initiated responses) or user-initiated requests (context-independent requests) in the token gathering state.
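The two kinds of input that must be distinguished in the token gathering state can be illustrated with a short Python sketch. The keyword patterns below are a deliberately simple, hypothetical stand-in for natural language understanding, and the function names, state names, and result format are assumptions made only for this illustration.

```python
import re

# Hypothetical stand-ins for real language understanding; keyword rules only.
FUNDS = ("Fixed Income", "Large Cap", "Growth")
REQUEST_PATTERNS = {
    "balance": re.compile(r"\b(balance|how much do i have)\b", re.IGNORECASE),
    "operator": re.compile(r"\boperator\b", re.IGNORECASE),
}


def find_fund(utterance):
    """Return the first known fund named in the utterance, or None."""
    lowered = utterance.lower()
    for fund in FUNDS:
        if fund.lower() in lowered:
            return fund
    return None


def interpret(state, utterance, pending_token=None):
    """Classify input as a user-initiated request or a direct token response."""
    # User-initiated requests are valid in either state.
    for action, pattern in REQUEST_PATTERNS.items():
        if pattern.search(utterance):
            return {"kind": "request", "action": action, "fund": find_fund(utterance)}
    # Otherwise, in the token gathering state, read the input against the prompt.
    if state == "token_gathering":
        fund = find_fund(utterance)
        if fund is not None:
            return {"kind": "token", "name": pending_token, "value": fund}
    return {"kind": "unknown"}


interruption = interpret(
    "token_gathering", "How much do I have in the Fixed Income fund?", "source_fund"
)
direct = interpret("token_gathering", "From the Fixed Income fund.", "source_fund")
```

Here the same fund name yields a balance request in the first call and a value for the pending source fund in the second, which is the distinction the summary above describes.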
Conventional mixed-initiative systems attempt to process all user input, that is, both user-initiated requests and system-initiated responses, using a single natural language understanding (NLU) module. This NLU module can be implemented as a statistical parser built by manually annotating training data with a set of markers representing the meaning of each sentence. Using a single NLU module in this manner, however, can be problematic.
In particular, when using a single NLU module, the module must be able to interpret user-initiated requests even when the mixed-initiative system is in the token gathering state. In consequence, a complex set of annotation markers is required to coherently represent the meaning of user requests under different contexts. Direct responses to system-initiated prompts also must be represented using the same complex annotation style. This increases annotation time and the skill required to annotate the training corpus.
Additionally, a large amount of training data is required to adequately represent the same user-initiated request under multiple contexts. This increases the amount of time required to annotate the training data and train the NLU module. Further, in the token gathering state, the system is expected to learn to use context information when processing token inputs and ignore context information when processing action requests. This conflict impacts the accuracy of the NLU module, as well as the overall mixed-initiative system, causing mistakes when interpreting simple direct responses in the token gathering state.
Further, the use of a single NLU module necessitates the use of the same annotation style to interpret both user-initiated requests and system-initiated responses. This requires the NLU module to identify the action from the input, even in cases where the user is only specifying a token. Quite often, this can result in an action that does not match the specified context.