Interactive voice response (IVR) systems have been in use for some time now. Typically, such systems operate such that a calling (or called) user is asked a series of questions and is prompted to give a response. At first, these responses were accomplished by the user touching a keypad number. In such systems the calling (or called) user would be prompted as follows: “Please touch one for today's flights and touch two for all other flight information.”
More recent systems allow the user to answer verbally. Thus the user prompts for such systems would be: “Please say one for today's flights and say two for all other flight information.” An alternate prompt would be: “Please say yes if you are flying today and no if you are flying any other day.” Another alternate prompt would be: “Say the time you would like to fly.” The user would be expected to respond with, “this morning” or “five p.m.”
The designer of such systems must code each application such that it follows a specific script, or call flow. Tools, including graphical tools using icons, are typically used for such application call flow coding. One example of such a tool is shown in U.S. Pat. No. 5,946,485, dated Aug. 31, 1999; and U.S. Pat. No. 6,131,184, dated Oct. 10, 2000, both of which are incorporated by reference herein.
In such prior art call flow tools, icons are used to illustrate for the designer the pertinent details of the call flow so that the designer can rearrange the call flow, or insert other paths or options into the call flow. Thus, in the prior art there is a single icon, such as icon 80 shown in FIG. 8, that a user places in the call flow and that represents the entire event recognition call flow. The designer can supply several parameters that are used in defining the particular event to be recognized. However, in order to change the event recognition format, a designer would have to add code to vary the structure or process which is to be followed by the standard event handling icon. In the prior art, a single icon represents a single recognition event including all actions leading to resolution of that recognition event.
Recently, IVR systems have begun to incorporate more complex caller voice recognition events, so that the caller might now hear the following prompt: “Please tell me your flight date and destination city.” These more complex types of recognition events are more difficult to program and to represent by a single icon.
Caller (talker) directed systems rely on the recognition of various responses from the calling (or called) user and can be as free-flowing as desired. Thus, a prompt could be: “Please tell me what I can do for you today.” A more typical prompt would be more specific, such as: “Please tell me what day you are flying and what flight number you are asking about.”
With the current state of the art, the application designer would code each of these scenarios to respond to the talker's answers. Speech recognition is then used to determine what the talker has responded. The graphical icon application tools do not work well for speech recognition applications. Today in the industry, a recognition event is handled by defining everything inside a single icon. All events that control a recognition event are packaged into a single icon. Although these icon tools exist today to provide macro level directed dialogue snippets graphically, the user does not have control to vary or supplement those singular events, except through extensive supplemental coding.
The call flow in speech recognition applications relies on the generation and ultimately the recognition of certain grammars. Each grammar is a collection of phrases that are passed to a system component. The system component then “listens” to the user input to determine if the user spoke one of the defined phrases. If the user speaks one of those phrases, that phrase is passed back to the application for subsequent processing within the call flow. However, the calling (or called) user could respond with a word or phrase which is out of context. Or, in multiple response situations (such as “what day and time are you flying?”), the system must know and process both responses before the next step is achieved. Establishing the code and call flow processing for situations such as these is difficult and time consuming, and would have to be repeated for each application and for any changes required in an application.
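By way of illustration only, the grammar behavior described above can be sketched in a few lines of Python. The names used here (Grammar, recognize, fill_slots) and the sample phrases are hypothetical and are not taken from any tool kit or from the figures. Plain text matching stands in for an actual speech recognition component, which this sketch does not attempt to model.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Grammar:
    """A named collection of phrases to listen for (hypothetical type)."""
    slot: str
    phrases: frozenset


def _normalize(utterance: str) -> str:
    # Lowercase, drop punctuation, collapse whitespace, and pad with spaces
    # so that phrases are matched on whole-word boundaries only.
    cleaned = re.sub(r"[^a-z0-9 ]", "", utterance.lower())
    return " " + " ".join(cleaned.split()) + " "


def recognize(grammar: Grammar, utterance: str) -> Optional[str]:
    """Return the matched phrase, or None if the response is out of context."""
    text = _normalize(utterance)
    # Try longer phrases first so the most specific phrase wins.
    for phrase in sorted(grammar.phrases, key=len, reverse=True):
        if f" {phrase} " in text:
            return phrase
    return None


def fill_slots(grammars, utterance):
    """Multi-response case: report which slots are still unfilled."""
    filled = {g.slot: recognize(g, utterance) for g in grammars}
    missing = [slot for slot, value in filled.items() if value is None]
    return filled, missing


day = Grammar("day", frozenset({"today", "tomorrow"}))
time = Grammar("time", frozenset({"this morning", "five pm"}))

# A caller who answers only part of "what day and time are you flying?"
filled, missing = fill_slots([day, time], "I am flying tomorrow")
print(filled, missing)
```

In this sketch the call flow would advance only when the list of missing slots is empty; otherwise the application would have to re-prompt for the unfilled slot, which is the kind of per-application handling the passage above describes as difficult and time consuming.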
In the state of the art today, the user can code for the return of specific words or responses. FIGS. 9A and 9B show such coding for a simple situation for single-slot returns and multi-slot returns, respectively. As shown in FIGS. 9A and 9B, there are carets surrounding code words, and the designer must parse through the code in order to understand the operation of the call flow so that desired changes can be made. This coding is graphically cumbersome and the applications become formidable. In existing graphic packages, the elements which hold onto the grammar definition, including the prompts, the timers, and the possible behavioral responses, are woven tightly into what is called the ‘tool kit,’ and graphical constructs to represent such alternative coding are not available.