The present invention generally pertains to a development framework that enables a developer to efficiently mix different types of dialog within a given application. More particularly, the present invention pertains to the development of applications that incorporate both semantics-driven and state-driven dialog.
Applications that support user interaction through a voice user interface (VUI) are well known in the art. During the development process, these types of applications can be authored on top of a low level application program interface (API) framework that provides access to basic resources. For example, it is known for a telephony application to be authored on top of a low level API framework that includes support for resources such as, but not necessarily limited to, a telephony infrastructure, speech recognition resources, and speech synthesis resources.
From the perspective of an application developer, authoring code that directly targets the described low level API resources is commonly a relatively tedious and labor intensive process. Higher level constructs are known to provide a more intuitive interface to the low level resources. In some cases, higher level constructs have been utilized as a basis for creation of a dialog authoring model in the form of an API framework that serves as an interface to the low level API resources, thereby simplifying the generation of application code. The objects included in the higher level API framework have been configured to support a variety of different development experiences.
The result of the development process is generation of an application that facilitates user-system dialog in one of several different possible formats. Some dialog will be system-driven (or system-initiative) dialog. In one example of this type of dialog, a user interfacing with a telephony application is presented with a spoken statement in the form of “welcome to my support application, please enter your product identification number.” In this case, no action is generally taken until the requested task is complete (i.e., a valid product identification number is entered). The system requires particular information, sometimes in a particular format. Thus, system-driven dialog is generally very constrained.
Some dialog will be user-driven (or user-initiative) dialog. In one example of this type of dialog, a user interfacing through a telephony application is presented with a spoken statement in the form of “welcome to my support application, how may I help you?” In response to this type of statement, the user can generally say anything, such as “I am having trouble with my machine” or “I want to return a product.” The system is then configured to identify the nature of the user's inquiry and respond accordingly, for example, “do you have a receipt?” The system determines what the key pieces of information are within the user's inquiry and then responds accordingly.
A development framework that supports semantics-driven dialog is generally more user-driven than system-driven. When authoring a section of semantics-driven dialog, a developer will generally specify which of a plurality of fields are to be filled in by obtaining appropriate information from the system user. In some ways, the semantics-driven format is similar to a form in a Graphical User Interface (GUI) application having certain fields to be filled in by the user. Instead of specifying a predetermined path through the fields (A→B→C, etc.), certain dialog nodes or elements are specified to react depending on the particular state of other fields. For example, a given dialog node A is specified to be active if field C is empty. Multiple dependencies are also possible, for example, a given dialog node is specified as active if fields A, B and C are empty but field E is filled and confirmed. Some fields can be set to require confirmation with the system user that their content is accurate. Following every user-machine interaction, a determination is made within the semantics-driven dialog framework as to which dialog node or nodes should be active next.
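The activation rule described above can be illustrated with a minimal sketch. It assumes a simple in-memory model; the names `FormField`, `DialogNode`, `active_nodes`, `node_A` and `node_X` are hypothetical and are not drawn from any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FormField:
    """One field to be filled in by the user (hypothetical model)."""
    value: Optional[str] = None
    confirmed: bool = False

    @property
    def empty(self) -> bool:
        return self.value is None


@dataclass
class DialogNode:
    """A dialog node whose activity depends on the state of the fields."""
    name: str
    is_active: Callable[[Dict[str, FormField]], bool]


def active_nodes(nodes: List[DialogNode], fields: Dict[str, FormField]) -> List[str]:
    """Re-evaluated after every user-machine interaction."""
    return [node.name for node in nodes if node.is_active(fields)]


def c_is_empty(f: Dict[str, FormField]) -> bool:
    # Single dependency: active if field C is empty.
    return f["C"].empty


def abc_empty_e_confirmed(f: Dict[str, FormField]) -> bool:
    # Multiple dependencies: A, B and C empty, E filled and confirmed.
    return (f["A"].empty and f["B"].empty and f["C"].empty
            and not f["E"].empty and f["E"].confirmed)


fields = {name: FormField() for name in ("A", "B", "C", "E")}
nodes = [
    DialogNode("node_A", c_is_empty),
    DialogNode("node_X", abc_empty_e_confirmed),
]

before = active_nodes(nodes, fields)            # only node_A is active
fields["E"] = FormField(value="yes", confirmed=True)
after = active_nodes(nodes, fields)             # both nodes are active
fields["C"].value = "some input"
final = active_nodes(nodes, fields)             # neither node is active
```

In this sketch no path through the fields is specified; the set of active nodes is simply recomputed from the field states whenever they change.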
A development framework that supports state-driven dialog is generally more system-driven than user-driven. Interaction flow within a state-driven dialog process is more predetermined than with semantics-driven dialog interactions. Decisions generally follow a predetermined path from one element to the next. For example, a request is made for a first particular item of information. In response, information is received from the user. An evaluation is made as to whether the received information is worthy of confidence. If not, a confirmation process is carried out. If so, then the system requests a predetermined second item of information.
In state-driven dialog, there generally is no way for a user to advance more information than what is presently being asked for by the system. At every step, the system generally decides what is going to be done next. It is common for developers to graphically represent state-driven dialog in the form of a flow chart. Unlike semantics-driven dialog, the dialog does not jump around depending on what the user provides as input.
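By way of contrast, the predetermined flow described above can be sketched as a fixed loop. This is an illustrative sketch only: the function `run_state_driven`, the callbacks `listen` and `confirm`, the item names, and the threshold value of 0.8 are all assumptions, as is the choice to repeat the request when confirmation fails.

```python
from typing import Callable, Dict, List, Tuple

CONFIDENCE_THRESHOLD = 0.8  # assumed value, chosen only for illustration


def run_state_driven(
    items: List[str],
    listen: Callable[[str], Tuple[str, float]],
    confirm: Callable[[str, str], bool],
) -> Dict[str, str]:
    """Request each item in a fixed order; the system decides every next step."""
    collected: Dict[str, str] = {}
    for item in items:
        while item not in collected:
            value, confidence = listen(item)  # request and receive one item
            # Accept a confident result directly; otherwise run confirmation.
            if confidence >= CONFIDENCE_THRESHOLD or confirm(item, value):
                collected[item] = value
            # Assumption: a failed confirmation repeats the same request.
    return collected


def demo() -> Tuple[Dict[str, str], List[str]]:
    """Drive the flow with scripted answers and record the order of steps."""
    answers = iter([("12345", 0.95), ("Seattle", 0.40)])
    log: List[str] = []

    def listen(item: str) -> Tuple[str, float]:
        log.append(f"ask:{item}")
        return next(answers)

    def confirm(item: str, value: str) -> bool:
        log.append(f"confirm:{item}")
        return True

    return run_state_driven(["product_id", "city"], listen, confirm), log


result, log = demo()
```

Here the user has no way to supply the second item while the first is being requested; the order of requests is fixed by the `items` list.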
The higher level API framework described above as providing an interface to the low level API resources can be configured to primarily support semantics-driven dialog. This enables a developer to author very flexible and natural dialogs. A disadvantage with such a configuration is that simple, system-driven dialog authoring becomes a relatively difficult undertaking.
The higher level API can alternatively be configured to primarily support state-driven dialog. In that case, it becomes easy to link dialog states with a condition (e.g., once you are finished with state A, you evaluate which condition is true and follow that path to the next state). This type of dialog development is easy to visualize and author. A disadvantage, however, is that the resulting dialog is neither natural nor flexible for the user of the application.