The exemplary embodiment relates to the field of dialog systems and finds particular application in connection with a system and method for expanding a dialog tree for learning a dialog system.
Spoken dialog systems (SDS) have recently become widely used in human-computer interfaces, especially for access to various public information systems. These systems use a virtual agent to conduct a dialog with a client, relying on a dialog manager to predict the next utterance of the agent. Despite their widespread use, a number of challenges have slowed their development, among them the time and cost of building such systems and the lack of expertise and training data. Various methods for developing SDS have been proposed, including statistical learning approaches, such as Reinforcement Learning (RL) (R. S. Sutton, et al., “Reinforcement Learning: An Introduction,” MIT Press, 1998), and rule-based hand-coded methods (Steve J. Young, “Using POMDPs for dialog management,” 2006 IEEE ACL Spoken Language Technology Workshop, pp. 8-13, 2006). Statistical learning methods offer several advantages over rule-based approaches: a data-driven development cycle, provably optimal action policies, a precise mathematical model for action selection, the possibility of generalization to unseen states, and automatic optimization of competing trade-offs in the objective function. A drawback of statistical approaches, however, is that they rely on the availability of a large quantity of data.
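The reinforcement learning approach mentioned above can be illustrated with a minimal sketch, purely for exposition and not drawn from the source: tabular Q-learning on a hypothetical three-state dialog, where the agent learns whether to ask a clarifying question or answer directly. All state names, actions, and rewards here are invented assumptions standing in for real user data.

```python
import random

# Hypothetical toy dialog MDP (all names and rewards are illustrative).
STATES = ["greeting", "query_unclear", "query_clear"]
ACTIONS = ["ask_clarification", "give_answer"]

def step(state, action):
    """Simulated environment standing in for real user interactions."""
    if state == "greeting":
        return "query_unclear", 0.0
    if state == "query_unclear":
        if action == "ask_clarification":
            return "query_clear", -0.1   # small cost for an extra turn
        return "end", -1.0               # answering an unclear query fails
    return "end", 1.0                    # answering a clear query succeeds

def q_learning(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy exploration/exploitation."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "greeting"
        while state != "end":
            if random.random() < epsilon:
                action = random.choice(ACTIONS)        # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward = step(state, action)
            future = 0.0 if nxt == "end" else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
    return q

q = q_learning()
best = max(ACTIONS, key=lambda a: q[("query_unclear", a)])
print(best)  # the learned policy prefers clarifying before answering
```

The sketch also makes the data-hunger problem concrete: the policy only becomes reliable after many simulated episodes, which is exactly the kind of interaction data that is expensive to collect from real users.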
In cases where a fixed dataset is used for learning, the optimal policy can only be discovered if it is present within the data (Andrew Y. Ng, et al., “Algorithms for inverse reinforcement learning,” Proc. 17th Intl Conf. on Machine Learning, pp. 663-670, 2000). In a reinforcement learning setting, however, dialog datasets have often been seen as an opportunity to propose a dialog policy before deployment and then to improve it throughout the reinforcement learning process of exploitation and exploration (Craig Boutilier, et al., “Accelerating reinforcement learning through implicit imitation,” CoRR, abs/1106.0681, 2011; Verena Rieser, “Bootstrapping reinforcement learning-based dialogue strategies from Wizard-of-Oz data,” PhD thesis, Saarland University, 2008).
Another approach is to generate data automatically from prior knowledge, such as generative grammars. A problem with this approach is that interaction models built off-line from handcrafted conversational models are often poor approximations of the way humans actually interact with computers. To overcome some of these problems, a technique known as the Wizard of Oz method was introduced (see, e.g., J. F. Kelley, “An empirical methodology for writing user-friendly natural language computer applications,” Proc. ACM CHI '83 Conf. on Human Factors in Computing Systems, Intelligent Interfaces, pp. 193-196, 1983; N. M. Fraser, et al., “Simulating speech systems,” Computer Speech and Language, 5(1):81-99, 1991; “Handbook of Standards and Resources for Spoken Language Systems,” Daffyd Gibbon, et al., eds, Mouton de Gruyter, Berlin, 1997; Niels Ole Bernsen, “Designing Interactive Speech Systems: from First Ideas to User Testing,” Springer-Verlag, Berlin, 1998). The method takes its name from L. Frank Baum's story, “The Wonderful Wizard of Oz”: the wizard, in this case a human, simulates a dialog system and collects data to be used for building a conversational model. The idea behind the method is that human simulation can be an efficient empirical method for developing user-friendly natural language applications by adopting a controlled, scenario-based authoring approach to dialog generation.
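The grammar-based data generation criticized above can be sketched as follows; the rules are invented for illustration and are not from the source. A handcrafted context-free grammar is expanded randomly to produce synthetic user utterances, which only approximate how humans actually talk to machines.

```python
import random

# Hypothetical handcrafted generative grammar: non-terminals map to
# lists of alternative productions; anything not in the table is a word.
GRAMMAR = {
    "<utterance>": [["<greet>", "<request>"], ["<request>"]],
    "<greet>": [["hello"], ["hi"]],
    "<request>": [["i", "want", "a", "<item>"], ["book", "a", "<item>", "please"]],
    "<item>": [["flight"], ["hotel"], ["taxi"]],
}

def expand(symbol):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:          # terminal word: emit as-is
        return [symbol]
    production = random.choice(GRAMMAR[symbol])
    words = []
    for sym in production:
        words.extend(expand(sym))
    return words

# Generate a few synthetic "user" utterances for off-line training.
for _ in range(3):
    print(" ".join(expand("<utterance>")))
```

Every utterance this produces is well-formed by construction, which is precisely the limitation: real users produce disfluencies, ellipses, and phrasings the grammar author never anticipated.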
For such a simulation to be as close as possible to the final system's behavior, a number of appropriate supporting tools are needed. Ideally, these tools should allow the wizard to control all parts of a dialog system, specifically, speech recognition, semantic analysis, dialog management, the domain knowledge base, natural language generation, and text-to-speech conversion (Sophie Rosset, et al., “Design strategies for spoken language dialog systems,” EUROSPEECH, ISCA, pp. 1535-1538, 1999). While publicly-available software for building dialog systems exists, such as the CSLU Toolkit (Stephen Sutton, et al., “The CSLU toolkit: Rapid prototyping of spoken language systems,” ACM Symp. on User Interface Software and Technology, pp. 85-86, 1997), none of these tools supports application of the Wizard of Oz technique.
Examples of finite-state based systems for designing and conducting such experiments include MDWOZ (Cosmin Munteanu, et al., “MDWOZ: A Wizard of Oz environment for dialog systems development,” LREC, European Language Resources Association, 2000) and SUEDE (Scott R. Klemmer, et al., “SUEDE: A Wizard of Oz prototyping tool for speech user interfaces,” Proc. 13th Annual Symp. on User Interface Software and Technology (UIST-00), pp. 1-10, 2000). MDWOZ features a distributed client-server architecture and includes modules for database access as well as visual graph drawing and inspection to formalize the dialog automata. SUEDE provides a GUI and a browser-like environment for running experiments, and an “analysis mode” in which the experimenter can easily access and review the collected data. A drawback of these systems, however, is that they only allow finite-state dialog modeling, which is restricted in its expressiveness.
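The expressiveness limitation of finite-state dialog modeling can be made concrete with a minimal sketch (states, prompts, and input classes below are hypothetical, not taken from MDWOZ or SUEDE): the whole dialog is a transition table mapping a (state, classified user input) pair to a next state, so only paths explicitly drawn in the graph can ever occur.

```python
# Hypothetical finite-state dialog model: one fixed prompt per state,
# plus a transition table over classified user inputs.
PROMPTS = {
    "ask_origin": "Where are you travelling from?",
    "ask_destination": "Where would you like to go?",
    "confirm": "Shall I book that trip?",
    "done": "Your trip is booked. Goodbye!",
}
TRANSITIONS = {
    ("ask_origin", "city"): "ask_destination",
    ("ask_destination", "city"): "confirm",
    ("confirm", "yes"): "done",
    ("confirm", "no"): "ask_origin",   # rejection restarts the dialog
}

def run_dialog(user_input_classes, start="ask_origin"):
    """Replay a sequence of classified user inputs through the automaton."""
    state, trace = start, []
    for cls in user_input_classes:
        trace.append(PROMPTS[state])
        # Any (state, input) pair absent from the graph leaves the state
        # unchanged -- unanticipated behavior simply cannot be modeled.
        state = TRANSITIONS.get((state, cls), state)
    trace.append(PROMPTS[state])
    return state, trace

final, trace = run_dialog(["city", "city", "yes"])
print(final)  # "done"
```

Mixed-initiative behavior, such as a user supplying origin and destination in one turn, has no natural encoding here short of enumerating every such path as an explicit state, which is the restriction noted above.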
One proposed method, in the context of tutoring dialog model learning, uses a mechanism of “progressive refinement,” but without a formal definition of the process (Armin Fiedler, et al., “Supporting progressive refinement of Wizard-of-Oz experiments,” 6th Intl Conf. on Intelligent Tutoring Systems, Workshop on Empirical Methods for Tutorial Dialogue, pp. 62-69, 2002). More recently, a web-based platform for performing Wizard of Oz experiments, named WebWOZ, has been proposed to simplify distribution of the annotation workflow (Stephan Schlögl, et al., “WebWOZ: a Wizard of Oz prototyping framework,” EICS, pp. 109-114, ACM, 2010). However, no mechanism of active or reactive learning of dialog-tree expansion in the context of Wizard of Oz experiments has been suggested.
There remains a need for an Active Wizard of Oz method that is able to support efficient, task-oriented dialog experiments for producing usable data for dialog policy learning, independently of the chosen policy learning model.