In many software applications, statistical classifiers are used to predict potential outputs. A statistical classifier assigns a probability distribution over all potential outputs, and the system can select the top n outputs with the highest probabilities. This is called the n-best selection method, and it has been used in speech recognition, natural language understanding, machine translation, and other applications. Traditionally, n is a fixed number.
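The fixed-n selection described above can be illustrated with a minimal sketch. The function name `n_best`, the label names, and the probabilities are hypothetical and chosen only for illustration; they are not taken from any particular system.

```python
import heapq
from typing import Dict, List, Tuple


def n_best(distribution: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n outputs with the highest probabilities, best first.

    If the distribution has fewer than n outputs, all of them are returned.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # nlargest ranks the (output, probability) pairs by probability.
    return heapq.nlargest(n, distribution.items(), key=lambda item: item[1])


# Hypothetical classifier output for a single input.
scores = {
    "play_music": 0.62,
    "navigate": 0.25,
    "call_contact": 0.10,
    "set_alarm": 0.03,
}

# In the traditional setup, n is a fixed parameter (here, 2).
top_two = n_best(scores, n=2)
```

Because n is fixed here, the same number of candidates is returned whether the distribution is sharply peaked or nearly flat.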
Dialog systems are systems in which a person speaks or otherwise enters input to a computer in natural language in order to accomplish a result. With the rise of microprocessor-controlled appliances and equipment, dialog systems are increasingly used to facilitate the man-machine interface in applications such as computers, automobiles, home appliances, and phone-based customer service. A dialog system processes the user's query and accesses one or more databases to retrieve responses to it, and may also perform other actions based on the user's request. To provide meaningful results with as little user interaction as possible, dialog systems should be designed and implemented to accommodate large variations in the content and format of the queries, as well as in the content and format of the responsive data.
Typically, a dialog system includes several modules or components, including a language understanding module, a dialog management module, and a response generation module. In the case of spoken dialog systems, a speech recognition module and a text-to-speech module are also included. Each module may include some number of sub-modules. When statistical approaches are used in one or more of these modules, multiple result candidates may be produced. In conventional systems, the number of candidates produced is fixed as a static parameter.
A persistent issue in modern dialog systems is coverage: they rely on static rules, data structures, and/or data content to process and return responses to user queries. Regardless of how comprehensive a dialog system is, it can never exhaust all the ways in which people may express themselves. To build a robust system, there is a need for dialog systems that include built-in adaptive components that can be easily trained and updated as new data are collected. Consequently, there is a need for a dialog system that can dynamically store utterances the system does not understand, and use these stored utterances to subsequently re-train the system. Re-training only on such utterances eliminates the wasteful effort of training the system on data it already understands.
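One way to picture the need stated above is a small sketch that sets aside utterances the system fails to understand so that only those are used for later re-training. The text does not say how "not understood" is detected; the confidence threshold used here is an assumption, and the names `UnderstandingLog`, `toy_classify`, and `drain_for_retraining` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A classifier maps an utterance to a (label, confidence) pair.
Classifier = Callable[[str], Tuple[str, float]]


@dataclass
class UnderstandingLog:
    classify: Classifier
    threshold: float = 0.5  # assumed cutoff for "understood"
    pending: List[str] = field(default_factory=list)

    def handle(self, utterance: str) -> Optional[str]:
        """Return the label if understood; otherwise store the utterance."""
        label, confidence = self.classify(utterance)
        if confidence < self.threshold:
            # Keep only what the system could not understand.
            self.pending.append(utterance)
            return None
        return label

    def drain_for_retraining(self) -> List[str]:
        """Hand over the stored utterances and clear the store."""
        batch, self.pending = self.pending, []
        return batch


def toy_classify(utterance: str) -> Tuple[str, float]:
    """Stand-in classifier: confident only when it sees the word 'play'."""
    if "play" in utterance.lower():
        return ("play_music", 0.9)
    return ("unknown", 0.1)


log = UnderstandingLog(classify=toy_classify)
understood = log.handle("play some jazz")
missed = log.handle("defrost the windshield")
retraining_batch = log.drain_for_retraining()
```

Only the second utterance ends up in the re-training batch, so utterances the system already handles are not trained on again.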