This invention relates to an interaction assistance system, and in particular to an automated assistant for a user interacting with a system using speech.
Previous automated dialog systems have been based on hand-constructed slot-filling applications. These are normally hand-tuned and accept only a subset of the English language as input, which tends to make them difficult to use and very hard to learn. Some such systems support mixed initiative, a mode in which the machine collects additional information about the conversation from the user. More recently, Partially-Observable Markov Decision Process (POMDP) approaches have used partially hidden Markov processes to track the state of the system: the system maintains multiple candidate states at each time and acts on a best guess at each time. In such prior systems, the semantics of the processes have been hand coded, or encoded as a simple probabilistic process if the dialog is simple enough. Semantics are tied to the meanings or actions associated with words and/or context.
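By way of illustration only, the tracking of multiple candidate states and acting on a best guess, as described for the POMDP approaches, can be sketched as a discrete belief update. The sketch below is not taken from any particular prior system. The function names (update_belief, best_guess), the two example states, and all probabilities are hypothetical, and the dependence of the transition model on the system's own action, which a full POMDP would include, is omitted for brevity.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(
    belief: Belief,
    transition: Dict[str, Dict[str, float]],
    observation_likelihood: Dict[str, float],
) -> Belief:
    """One predict-then-correct step over a discrete set of dialog states.

    belief[s]                  -- current probability of hidden state s
    transition[s1][s2]         -- assumed probability of moving from s1 to s2
    observation_likelihood[s]  -- assumed probability of the latest
                                  recognizer output given state s
    """
    states = list(belief)
    # Predict: push the current belief through the transition model.
    predicted = {
        s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
        for s2 in states
    }
    # Correct: weight each state by how well it explains the observation.
    unnormalized = {s: predicted[s] * observation_likelihood[s] for s in states}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every state")
    return {s: p / total for s, p in unnormalized.items()}


def best_guess(belief: Belief) -> str:
    """Return the single most probable state, on which the system would act."""
    return max(belief, key=belief.get)


# Hypothetical example: two candidate user goals, initially equally likely.
prior = {"want_weather": 0.5, "want_music": 0.5}
stay_or_switch = {
    "want_weather": {"want_weather": 0.9, "want_music": 0.1},
    "want_music": {"want_weather": 0.1, "want_music": 0.9},
}
# The recognizer output is assumed to fit "want_weather" better.
likelihood = {"want_weather": 0.8, "want_music": 0.2}

posterior = update_belief(prior, stay_or_switch, likelihood)
guess = best_guess(posterior)
```

In this sketch the system never commits to a single state internally; it keeps a probability for every candidate and only selects the most probable one at the moment it must act.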
Previous telephone-based assistants were not, in general, dialog agents, but were instead single-utterance command/response systems. In a number of such systems, the user can request either a piece of information or an action, and the system responds appropriately if the speech recognizer is accurate and the user utters a request from within the vocabulary of the system. However, in general, these systems were brittle, did not understand paraphrase, did not carry context across sessions, and mostly did not carry context even within an interaction session.