Response generation systems, also known as dialog systems or conversational agents, are becoming increasingly common in a variety of computing systems and devices. Response generation systems are designed to interpret natural language input messages received from users and to output natural language responses to those users. Current dialog systems are typically rule-based systems that utilize hand-scripted dialog and rely upon statistical algorithms to track state from one step of a conversation to the next. Many components of these dialog systems remain hand-coded. In particular, these systems rely upon labels and attributes that define dialog states.
However, these rule-based systems generally take into account only the current user input message and the response to it. They do not lend themselves to incorporating information about the preceding conversational context.
Moreover, current response generation systems usually separate dialog management from response generation. Due to these limitations, the responses output by these systems are often inappropriate or lack pertinence to the user input message and/or the conversation. These systems are also not very robust: they do not adapt well to new domains and they do not scale.
One proposed alternative to rule-based systems borrows from machine translation techniques by attempting to map phrases in an input sentence to phrases in a lattice of possible outputs. Machine translation may also be referred to as automated language translation. These systems use phrase table lookup to provide the mappings. However, attempting to add contextual information to these machine translation-based systems increases sparsity and skew in the phrase table that stores the mappings between messages and responses. In other words, injecting context information into these machine translation models causes unmanageable growth of the phrase table, at the cost of increased sparsity and skew towards rarely-seen context pairs. In addition, in many current statistical approaches to machine translation, phrase pairs do not share statistical weights, regardless of their intrinsic semantic commonality.
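The growth and sparsity problem described above can be illustrated with a minimal sketch. It is not an implementation of any particular machine translation system: the toy corpus, the function names build_table and singleton_fraction, and the use of whole utterances as table keys (real phrase tables map sub-sentence phrases) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus of (context, message, response) triples.
TRIPLES = [
    ("i am tired", "long day", "get some rest"),
    ("work was busy", "long day", "get some rest"),
    ("we hiked ten miles", "long day", "sounds fun"),
    ("flight got delayed", "long day", "get some rest"),
    ("good morning", "how are you", "fine thanks"),
    ("hey there", "how are you", "fine thanks"),
]


def build_table(triples, use_context):
    """Count (key, response) pairs; the key optionally includes the context."""
    table = Counter()
    for context, message, response in triples:
        key = (context, message) if use_context else (message,)
        table[(key, response)] += 1
    return table


def singleton_fraction(table):
    """Fraction of entries observed exactly once (a crude sparsity measure)."""
    return sum(1 for count in table.values() if count == 1) / len(table)


# Table keyed on the message alone versus on (context, message).
plain = build_table(TRIPLES, use_context=False)
contextual = build_table(TRIPLES, use_context=True)

print(len(plain), singleton_fraction(plain))
print(len(contextual), singleton_fraction(contextual))
```

In this toy setting, keying on the message alone yields three entries, one of which was seen only once, while adding the context to the key yields six entries, each seen exactly once. The table doubles in size while every count drops to a single observation, which is the sparsity effect in miniature.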