Many techniques currently exist for specifying dialog control logic in voice and multimodal dialog systems. At the lowest level of abstraction are finite state machines (FSMs), which explicitly enumerate the various states and transitions in a dialog flow. FSMs have frequently been used as a technique for specifying dialog flows. Recently, proposals have been made to use Harel Statecharts (Statecharts), also known as Hierarchical State Machines (HSMs), as a generic control language for specifying user interaction control logic. Statecharts are similar to FSMs, but they are augmented with a variety of additional constructs, including hierarchical states, guard conditions, and parallel states. These added constructs can make Statecharts simpler and more extensible than equivalent FSMs, because they factor out common behavior into shared super-states, eliminating duplication of logic.
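By way of illustration only, the following Python sketch shows how a hierarchical state can factor out common behavior. It is not taken from any Statechart standard or engine: the `State` class, the `next_state` function, and the flight-booking state names are all hypothetical. A single "cancel" transition is declared once on the super-state `booking` and is inherited by every sub-state, whereas a flat FSM would have to repeat that transition in each state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class State:
    """Hypothetical hierarchical state: a name, an optional parent, and local transitions."""
    name: str
    parent: Optional["State"] = None
    transitions: Dict[str, str] = field(default_factory=dict)  # event -> target state name


def next_state(states: Dict[str, State], current: str, event: str) -> str:
    """Walk from the current state up through its ancestors until one handles the event."""
    node: Optional[State] = states[current]
    while node is not None:
        if event in node.transitions:
            return node.transitions[event]
        node = node.parent
    return current  # in this sketch, unhandled events leave the state unchanged


# Hypothetical flight-booking flow: "cancel" is declared once, on the super-state.
booking = State("booking", transitions={"cancel": "goodbye"})
ask_origin = State("ask_origin", parent=booking, transitions={"filled": "ask_destination"})
ask_destination = State("ask_destination", parent=booking, transitions={"filled": "confirm"})
confirm = State("confirm", parent=booking, transitions={"yes": "goodbye"})
goodbye = State("goodbye")

STATES = {s.name: s for s in (booking, ask_origin, ask_destination, confirm, goodbye)}
```

In this sketch, adding a further sub-state of `booking` requires no extra "cancel" transition, which is the kind of extensibility benefit described above. Guard conditions and parallel states are omitted for brevity.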
At a higher level of abstraction than FSMs and Statecharts are frame-based techniques for dialog management. In many task-based dialogs, a system requires certain pieces of information from the user in order to accomplish some domain-specific task (such as booking a flight, finding a restaurant, or determining whom the user wants to call on the phone). A frame is a data structure that can hold the required information. A primary advantage of frame-based techniques over finite-state scripts is that they enable a dialog designer to create a relatively complex dialog in a more compact format. A frame succinctly represents a large number of states by eliminating much of the explicit process logic that is required in FSMs and Statecharts. Another advantage of frame-based techniques is that they make it easier to model mixed-initiative dialog, because it is easy to specify grammars, prompts, and actions that have scope over multiple fields contained within a form. A primary reason for the current popularity of frames is the existence of standards, such as the World Wide Web Consortium Voice Extensible Markup Language version 2.0 (W3C VoiceXML 2.0) standard, which adopt the frame-based approach to dialog specification. In the VoiceXML 2.0 standard, frames come in two varieties, “forms” and “menus”. An example of a VoiceXML frame is shown in FIG. 1. An example of a frame written in the frame-specification language Motorola Portable Dialog Frame Language (MPD-FL) of the Motorola Corporation is shown in FIG. 2.
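As a rough illustration of the frame concept described above, the following Python sketch models a frame as a data structure holding the fields a task requires. It is not VoiceXML or MPD-FL, and it does not reproduce FIG. 1 or FIG. 2; the `Slot` and `Frame` classes, the field names, and the prompts are assumptions made for this example. The `fill` method accepts values for several fields at once, which is one simple way to picture mixed-initiative input with scope over multiple fields.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Slot:
    """Hypothetical frame field: a name, a prompt, and an optional value."""
    name: str
    prompt: str
    value: Optional[str] = None


@dataclass
class Frame:
    """Hypothetical frame: a named, ordered collection of slots."""
    name: str
    slots: List[Slot] = field(default_factory=list)

    def fill(self, recognized: Dict[str, str]) -> None:
        """Store every recognized value whose name matches a slot in this frame."""
        for slot in self.slots:
            if slot.name in recognized:
                slot.value = recognized[slot.name]

    def missing(self) -> List[str]:
        """Return the names of slots that still have no value, in declaration order."""
        return [slot.name for slot in self.slots if slot.value is None]


# Hypothetical flight-booking frame.
flight = Frame("book_flight", [
    Slot("origin", "Where are you flying from?"),
    Slot("destination", "Where are you flying to?"),
    Slot("date", "What day do you want to travel?"),
])

# A single user utterance may supply more than one field.
flight.fill({"origin": "Boston", "destination": "Denver"})
remaining = flight.missing()
```

Note that the frame lists only what information is needed; it contains no explicit states or transitions, which is why it is more compact than an equivalent FSM or Statechart.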
While there are some advantages to frame-based techniques for dialog specification, there are some disadvantages as well. These stem from the fact that frame-based dialog managers require built-in algorithms for interpreting frames, since the frame itself is a primarily declarative structure that omits most of the process control logic required to use it in a dialog. In VoiceXML, this built-in algorithm is called the “Form Interpretation Algorithm” (FIA). In this document, the term “FIA” is used as a generic term for any algorithm that reads in a frame and generates a corresponding dialog flow. This reliance on an FIA leads to two sorts of problems. First, it can be hard to verify and debug a frame, since it is not easy to visualize the current state and the current possibilities for transitioning to other states. Second, if the dialog designer wants to create a dialog that does not fit well with the built-in FIA, then he or she must struggle against the constraints of the framework in order to implement the desired logic.
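The role of an FIA, in the generic sense used in this document, can be sketched as follows. This Python example is a simplified illustration and is not the Form Interpretation Algorithm defined by VoiceXML 2.0; the function `interpret_frame`, its parameters, and the scripted replies are hypothetical. The frame is given as a list of (field name, prompt) pairs, and the function repeatedly selects the first unfilled field, plays its prompt, and stores whatever fields the reply supplies.

```python
from typing import Callable, Dict, List, Tuple


def interpret_frame(
    fields: List[Tuple[str, str]],
    get_slots: Callable[[str], Dict[str, str]],
    max_turns: int = 10,
) -> Tuple[Dict[str, str], List[str]]:
    """Drive a dialog from a declarative frame.

    fields:    ordered (field name, prompt) pairs making up the frame.
    get_slots: plays a prompt and returns the field values recognized in the reply.
    max_turns: safety limit so the loop ends even if nothing is ever recognized.
    Returns the filled values and the sequence of fields that were prompted for.
    """
    prompts = dict(fields)
    values: Dict[str, str] = {}
    trace: List[str] = []
    for _ in range(max_turns):
        pending = [name for name, _ in fields if name not in values]
        if not pending:
            break  # every field is filled; the frame is complete
        current = pending[0]  # the flow is decided here, not in the frame
        trace.append(current)
        recognized = get_slots(prompts[current])
        # Keep only values for fields that belong to this frame.
        values.update({k: v for k, v in recognized.items() if k in prompts})
    return values, trace


# Hypothetical frame and scripted user replies standing in for a recognizer.
FRAME = [
    ("origin", "Where are you flying from?"),
    ("destination", "Where are you flying to?"),
    ("date", "What day do you want to travel?"),
]
replies = iter([
    {"origin": "Boston", "destination": "Denver"},  # the user answers two fields at once
    {"date": "Friday"},
])
values, trace = interpret_frame(FRAME, lambda prompt: next(replies))
```

In this sketch the order of prompts and the handling of over-answered questions are determined entirely inside `interpret_frame`. The frame itself exposes no states or transitions to inspect, and a designer who wants a different flow has to change or work around the algorithm, which reflects the two problems noted above.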
While the first problem could be remedied to some degree with proper visualization tools, the second problem is intrinsic to the use of frames. It remains the case that frame-based techniques are suitable for some kinds of dialogs (those fitting well with the “form-filling” or “menu selection” metaphors) but not for many other types of dialog. This has prompted designers to look at the use of Statecharts as a control language for the future VoiceXML 3.0 standard. This new control language has been termed Statechart XML (SCXML) by the W3C Working Group, which plans on using it to augment the frame-language defined in VoiceXML 2.0.
The technical details of Statecharts are known in the art, and the use of Statecharts with dialog systems has already been proposed. However, prior publications do not describe how to generate the Statecharts from higher-level dialog abstractions. The generation of deterministic Statecharts from feature models has been disclosed, but feature models are quite different from frames and frame constructs. Feature models are more like domain models than compact descriptions of possible dialog moves, such as forms and menus.
Statecharts have also been used as a starting point for generating software, but automatic generation of the Statecharts themselves from other constructs has not been proposed. In addition, the use of declarative constructs (other than frame constructs) to generate simple state machines has been disclosed, but these do not generate Statecharts.