The present invention pertains to machine-implemented speech recognition. More particularly, the present invention relates to a tool for creating and editing grammars for machine-implemented speech recognition.
The use of speech recognition technology is rapidly becoming ubiquitous in everyday life. One application of speech recognition technology is in Interactive Voice Response (IVR) systems. IVR systems are commonly used to automate certain tasks that otherwise would be performed by a human being. More specifically, IVR systems are systems which create a dialog between a human speaker and a computer system to allow the computer system to perform a task on behalf of the speaker, to avoid the speaker or another human being having to perform the task. This operation generally involves the IVR system's acquiring specific information from the speaker. IVR systems may be used to perform very simple tasks, such as allowing a consumer to select from several menu options over the telephone. Alternatively, IVR systems can be used to perform more sophisticated functions, such as allowing a consumer to perform banking or investment transactions over the telephone or to book flight reservations.
Current IVR systems typically are implemented by programming standard computer hardware with special-purpose software. In a basic IVR system, the software includes a speech recognition engine and a speech-enabled application (e.g., a telephone banking application) that is designed to use recognized speech output by the speech recognition engine. The hardware may include one or more conventional computer systems, such as personal computers (PCs), workstations, or other similar hardware. These computer systems may be configured by the software to operate in a client or server mode and may be connected to each other directly or on a network, such as a local area network (LAN). The IVR system also includes appropriate hardware and software for allowing audio data to be communicated to and from the speaker through an audio interface, such as a standard telephone connection.
The speech recognition engine recognizes speech from the speaker by comparing the speaker's utterances to a set of “grammars” stored in a database. In this context, a grammar may be defined as a set of one or more words and/or phrases (“expressions”) that a speaker is expected or required to utter in response to a corresponding prompt, and the logical relationships between such expressions. The logical relationships include the expected or required temporal relationships between expressions, whether particular expressions are mandatory, optional, alternatives, etc. Hence, the speech recognition engine may use various different grammars, according to the type of information required by the speech-enabled application.
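By way of illustration only, the notion of a grammar as a set of expressions plus their logical relationships might be modeled as a small tree of nodes. The following is a minimal sketch, not the representation used by any particular speech recognition engine; the class names (Word, Seq, Alt, Opt), the helper functions, and the example banking grammar are all hypothetical and chosen only to show mandatory, optional, and alternative expressions in a fixed temporal order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A single mandatory word (hypothetical node type)."""
    text: str


@dataclass(frozen=True)
class Seq:
    """Expressions that must be uttered in the given temporal order."""
    parts: tuple


@dataclass(frozen=True)
class Alt:
    """Alternatives: exactly one of the choices is uttered."""
    choices: tuple


@dataclass(frozen=True)
class Opt:
    """An expression that may be uttered or omitted."""
    item: object


def expansions(node):
    """Return every word sequence (as a tuple of words) the node accepts."""
    if isinstance(node, Word):
        return [(node.text,)]
    if isinstance(node, Opt):
        return [()] + expansions(node.item)
    if isinstance(node, Alt):
        out = []
        for choice in node.choices:
            out.extend(expansions(choice))
        return out
    if isinstance(node, Seq):
        results = [()]
        for part in node.parts:
            results = [r + e for r in results for e in expansions(part)]
        return results
    raise TypeError(f"unknown grammar node: {node!r}")


def matches(node, utterance):
    """Check whether a recognized utterance is one the grammar allows."""
    return tuple(utterance.lower().split()) in set(expansions(node))


# Hypothetical grammar for a prompt such as "Which account?"
TRANSFER = Seq((
    Opt(Word("please")),
    Word("transfer"),
    Alt((Word("checking"), Word("savings"))),
))
```

Under these assumptions, matches(TRANSFER, "please transfer savings") holds, while an utterance that omits the mandatory account name does not match. Enumerating all expansions is practical only for small grammars; it is used here solely to make the logical relationships concrete.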
Defining the set of grammars for a particular IVR application can be time-consuming and difficult. Accordingly, it is desirable to have a tool which facilitates the creation and editing of speech recognition grammars.
The present invention includes a tool for allowing a user to create or edit grammars for speech recognition quickly and easily. An aspect of the present invention is a method and apparatus for providing a user interface, such that user inputs are received specifying a modification to a displayed grammar specification language (GSL) sequence. The displayed GSL sequence represents a grammar. In response to the user inputs, the displayed GSL sequence and data representing a set of displayable graphical objects representing the grammar are modified.
Another aspect of the present invention is a method and apparatus for providing a user interface for allowing a user to create and edit grammars for speech recognition, in which first user inputs that specify a first grammar for speech recognition are received. In response to the first user inputs, a first set of graphical objects representing the first grammar is generated and a corresponding first GSL sequence representing the first grammar is also generated. Second user inputs specifying a second GSL sequence representing a second grammar for speech recognition are also received. In response to the second user inputs, data representing a second set of graphical objects are generated, wherein the second set of graphical objects represents the second grammar.
Yet another aspect of the present invention is a method and apparatus for allowing a user to create and edit grammars for speech recognition, including receiving first user inputs that specify a modification to a displayed set of graphical objects which represent a grammar. In response to the first user inputs, the displayed set of graphical objects and a GSL sequence textually representing the grammar are concurrently modified.
Still another aspect of the present invention is a method and apparatus for providing a user interface for allowing a user to edit grammars for speech recognition, such that the method includes operating in a first editing mode which allows the user to enter first inputs to specify a first grammar. In response to the first inputs, a first set of graphical objects and a corresponding first GSL sequence representing the first grammar are generated. The method further includes operating in a second editing mode for allowing the user to enter second inputs to specify a second GSL sequence. The second GSL sequence includes a second grammar, such that in response to the second inputs, a second set of graphical objects representing the second GSL sequence is generated.
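The relationship between the two editing modes described above can be sketched, purely as an illustration, as two views kept in step: a tree of nodes standing in for the graphical objects, and a textual sequence. The notation assumed here (parentheses for a sequence, square brackets for alternatives, a leading "?" for an optional expression) is a simplification adopted for this sketch and is not asserted to be the GSL syntax of the invention; Node, GrammarEditor, and the function names are likewise hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Stand-in for one graphical object: 'word', 'seq', 'alt', or 'opt'."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def to_text(node):
    """Generate the textual sequence from the tree (graphical mode -> text)."""
    if node.kind == "word":
        return node.text
    if node.kind == "opt":
        return "?" + to_text(node.children[0])
    inner = " ".join(to_text(child) for child in node.children)
    return "(" + inner + ")" if node.kind == "seq" else "[" + inner + "]"


def tokenize(text):
    """Split the textual sequence into words and punctuation tokens."""
    for ch in "()[]?":
        text = text.replace(ch, f" {ch} ")
    return text.split()


def parse(tokens, pos=0):
    """Recursive-descent parse; returns (node, next position)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of sequence")
    tok = tokens[pos]
    if tok in (")", "]"):
        raise ValueError(f"unexpected {tok!r}")
    if tok == "?":
        child, pos = parse(tokens, pos + 1)
        return Node("opt", children=[child]), pos
    if tok in ("(", "["):
        close, kind = (")", "seq") if tok == "(" else ("]", "alt")
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != close:
            child, pos = parse(tokens, pos)
            children.append(child)
        if pos >= len(tokens):
            raise ValueError(f"missing {close!r}")
        return Node(kind, children=children), pos + 1
    return Node("word", text=tok), pos + 1


def from_text(text):
    """Generate the tree from the textual sequence (text mode -> graphical)."""
    tokens = tokenize(text)
    node, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError("trailing tokens after grammar")
    return node


class GrammarEditor:
    """Holds both views and regenerates one whenever the other is edited."""

    def __init__(self, text):
        self.edit_text(text)

    def edit_text(self, text):
        # Text-mode edit: rebuild the tree, then normalize the text.
        self.tree = from_text(text)
        self.text = to_text(self.tree)

    def edit_tree(self, tree):
        # Graphical-mode edit: accept the new tree and regenerate the text.
        self.tree = tree
        self.text = to_text(tree)
```

In this sketch each edit fully regenerates the other view, which keeps the two representations consistent at the cost of re-parsing on every change; an actual tool could instead update only the affected objects.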
Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.