1. Technical Field
This invention relates to the field of voice response systems and more particularly to a method and system for voice data entry recognition in a voice response system.
2. Description of the Related Art
In data information systems that employ forms through which a user can supply data to the system, a field or a series of fields can often be completed by the user. In such data information systems, users can supply data for each field in the form. However, data fields can be restricted with regard to the data which can be supplied therein. For instance, in a data information system for collecting user contact information, a form can restrict data supplied in the city and state fields to those cities or states which are available (have associated data) in the data information system. As an example, while Miami, Fla. might be an available city and state pair in the system, the data information system might not have any information about the city and state pair Sunny, Ga., making Sunny, Ga. unavailable in the system.
In visually interactive data information systems employing visual interfaces, data to be supplied in a field in a form can be restricted through the use of a corresponding list box. In a list box, users can be presented with a predefined list of data entries acceptable for input in a corresponding field. Still, the use of a list box in an audibly interactive data information system employing an audio user interface can prove tedious at best. First, in an audio user interface, for each list box corresponding to a field in a form, the audio user interface must audibly play back the acceptable data entries in the list box until the user selects one of the acceptable data entries. As an alternative, the user can memorize each available data entry in the list box prior to audibly supplying an available data entry.
Second, unlike the case of a visual user interface, in an audio user interface, the problem of data entry availability can be compounded with the problem of speech recognition. In particular, in the case of a visual user interface, the user can select an available data entry with a mouse-click or by typing the acceptable data entry. In either case, the user-supplied data entry is unmistakable. In contrast, in the case of an audio user interface, in addition to surmounting the data availability process, the user-supplied data must surmount recognition problems associated with the speech recognition process. More specifically, anything that is to be audibly supplied to a field in a form through an audio user interface not only must be considered an available entry from the perspective of the data information system, but also must be considered a speech recognizable entry from the perspective of the speech recognition engine.
For example, in a voice response system, each word supplied as a voice response must exist in a speech recognition grammar in order to successfully undergo a speech-to-text conversion process. If a user utters a word not contained in the speech recognition grammar, an Out of Grammar (hereinafter “OOG”) condition can arise. Typically, a voice response system can respond to an OOG condition by “throwing” an OOG exception. When an OOG exception is thrown, a voice response system can only inform the user that the voice response provided to the voice response system was not understood (because it was not located in the speech recognition grammar).
The circumstance in which a user interacts with a voice response system for providing information regarding particular cities is an example of this problem. When prompted by the voice response system to provide the name of a city for which the voice response system can provide information, a user can utter, “Sunny” as in Sunny, Ga. Preferably, if the voice response system does not contain information on Sunny, Ga., the user should be notified, “There is no information on Sunny, Ga.” However, if Georgia is not included in the speech recognition grammar, when the user utters Sunny, Ga., the voice response system will throw an OOG exception and the data information system will respond with, “I did not understand what you said.” Consequently, the voice response system cannot indicate to the user that Sunny, Ga. is not an available city/state pair in the data store of the voice response system because the voice response system never successfully speech recognized the user voice input “Sunny, Ga.” in the first place. Thus, there exists a need for a voice response system in which words not contained in the data stores of the voice response system are nonetheless recognized by the voice response system so that the voice response system can report the same to the user.
The present invention is a voice response system in which words not contained in the data stores of the voice response system are nonetheless recognized by the voice response system so that the voice response system can report “No information on . . .” rather than reporting an OOG exception. The present invention solves the problem of the OOG condition by overloading the speech recognition grammar with word data entries which may or may not exist in the voice response system data stores. In consequence, voice responses which a user might speak are at least recognizable and “actionable” by the voice response system, even though the voice response may not be an available response in the data store. In the above-described example, a user can provide the voice response, “Sunny, Ga.” and receive in return from the voice response system, “There is no information on Sunny, Ga.” In contrast, a user can provide the voice response, “Iskabibble” which correctly can cause an OOG condition. In response, the user can receive from the voice response system, “I did not understand what you said”. Hence, the present invention alleviates the OOG condition which would otherwise create a bad usability problem.
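By way of illustration only, the overloading of the speech recognition grammar described above can be sketched as follows. The sketch assumes that grammar entries and data store entries can be modeled as plain lowercase strings; the names `build_grammar`, `available_entries`, and `extra_entries` are hypothetical and do not appear in the invention as described.

```python
# Hypothetical sketch: the grammar is modeled as a set of lowercase strings.
# A real speech recognition grammar would be richer than this.

def build_grammar(available_entries, extra_entries):
    """Return a grammar holding the available entries plus extra entries.

    The result is a superset of the data store entries, so an entry can be
    recognizable without being available.
    """
    normalized_available = {entry.lower() for entry in available_entries}
    normalized_extra = {entry.lower() for entry in extra_entries}
    return frozenset(normalized_available | normalized_extra)


# Entries for which the data information system has associated data.
available_entries = ["Miami, Fla."]
# Entries a user might plausibly speak, whether or not data exists for them.
extra_entries = ["Sunny, Ga."]

grammar = build_grammar(available_entries, extra_entries)

# Entries that are recognizable but not available in the data store.
recognizable_only = grammar - {entry.lower() for entry in available_entries}
```

Under these assumptions, “Sunny, Ga.” falls in the grammar but outside the data store, while “Iskabibble” falls outside both and would still give rise to an OOG condition.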
A method for voice data entry availability in a voice response system can include establishing a data set of words relating to a data information system; including the data set of words in a speech grammar for use with a speech recognition engine; and including a subset of the data set in a data store, wherein the subset has words used by the data information system, and the subset does not have words in the data set which are not used by the data information system. Subsequently, speech queries can be received which specify data. The speech queries can be received through an audio user interface to the data information system. Speech-to-text conversion can be performed on the speech queries using the speech recognition engine.
If the specified data is in the data set and if the specified data also is in the subset, the speech queries can be processed with the specified data. However, if the specified data is in the data set, but the specified data is not in the subset, the specified data is reported not to be in the subset. Finally, if the specified data is not in the data set, it is reported that the specified data cannot be speech-to-text converted. Furthermore, the speech query is not processed.
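The three outcomes described above can be illustrated with a minimal sketch. It assumes that the speech query has already been reduced to text and that the data set and the subset are plain sets of lowercase strings; `Outcome` and `handle_query` are hypothetical names introduced only for this illustration, and the response wording follows the examples given earlier.

```python
from enum import Enum


class Outcome(Enum):
    PROCESSED = "processed"            # in the data set and in the subset
    NOT_AVAILABLE = "not available"    # in the data set, not in the subset
    NOT_RECOGNIZED = "not recognized"  # not in the data set at all


def handle_query(specified_data, data_set, subset):
    """Classify a query and return (outcome, message).

    Assumption: text stands in for the spoken query; no audio is involved.
    """
    key = specified_data.strip().lower()
    if key not in data_set:
        # Cannot be speech-to-text converted; the query is not processed.
        return Outcome.NOT_RECOGNIZED, "I did not understand what you said."
    if key not in subset:
        # Recognized, but the data information system does not use it.
        return Outcome.NOT_AVAILABLE, f"There is no information on {specified_data}"
    return Outcome.PROCESSED, f"Processing query for {specified_data}"


data_set = {"miami, fla.", "sunny, ga."}  # words in the speech grammar
subset = {"miami, fla."}                  # words in the data store

result_available = handle_query("Miami, Fla.", data_set, subset)
result_unavailable = handle_query("Sunny, Ga.", data_set, subset)
result_unknown = handle_query("Iskabibble", data_set, subset)
```

The check against the data set is made first because, in this sketch, a query that cannot be recognized can never reach the availability check.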
In the preferred embodiment, the step of reporting that the specified data cannot be speech-to-text converted can include throwing an Out-Of-Grammar (OOG) exception. Additionally, the step of receiving speech queries through an audio user interface to the data information system can include receiving speech queries in the voice response system telephonically. Specifically, the speech queries can originate through a telephone handset. Subsequently, the speech queries can be transmitted through a telephone data network and received in the voice response system through a telephone data network interface in the voice response system. Finally, the speech queries can be communicated from the telephone data network interface to the audio user interface.
Alternatively, the speech queries can originate in a kiosk, through a personal digital assistant, a personal computer, or any other suitable platform for providing audio input to a computer speech recognition system. Notably, the audio user interface can be a Voice Browser to a Web-enabled data information system, wherein the Voice Browser enables voice operation of the Web-enabled data information system.
A method for voice data entry availability in a voice response system can also include receiving speech input specifying data in an audio user interface to a data information system for processing data in a data store. The speech input can be received through an audio user interface to the data information system. Subsequently, speech-to-text conversion of the speech input can be performed using a speech recognition engine with reference to a corresponding speech grammar. In particular, the speech grammar can contain a data set of words relating to the data information system. Notably, the data store can contain a subset of the data set, the subset having words which can be processed by the data information system, the subset not having words which cannot be processed by the data information system.
If the specified data is included in the speech grammar and if the specified data is in the data store, the speech data in the speech query can be processed. However, if the specified data is not in the data store, it can be reported that the specified data cannot be processed. Finally, if the specified data is not included in the speech grammar, an Out-Of-Grammar (OOG) condition can be reported. Additionally, the speech data in the speech query is not processed.
In the preferred embodiment, the step of reporting an OOG condition can include throwing an OOG exception. Additionally, the step of receiving speech input in an audio user interface to the data information system can include receiving speech input in the voice response system telephonically. Specifically, the speech input can originate through a telephone handset. Subsequently, the speech input can be transmitted through a telephone data network and the speech input can be received in the voice response system through a telephone data network interface in the voice response system. Finally, the speech input can be communicated from the telephone data network interface to the audio user interface. Notably, the audio user interface can be a Voice Browser to a Web-enabled data information system, the Voice Browser enabling voice operation of the Web-enabled data information system.
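As a further illustration, the throwing of an OOG exception can be sketched as shown below. This is a sketch under stated assumptions rather than an implementation of any particular speech recognition engine: `OutOfGrammarError`, `speech_to_text`, and `respond` are hypothetical names, the utterance is represented as text rather than audio, and the data store is modeled as a dictionary whose stored value is a placeholder.

```python
class OutOfGrammarError(Exception):
    """Hypothetical exception raised when an utterance is not in the grammar."""


def speech_to_text(utterance, grammar):
    """Stand-in for a recognition engine; the utterance is already text."""
    key = utterance.strip().lower()
    if key not in grammar:
        raise OutOfGrammarError(utterance)
    return key


def respond(utterance, grammar, data_store):
    """Return the message that the audio user interface would speak."""
    try:
        key = speech_to_text(utterance, grammar)
    except OutOfGrammarError:
        # OOG condition: the speech data is not processed.
        return "I did not understand what you said."
    if key not in data_store:
        # Recognized by the grammar but absent from the data store.
        return f"There is no information on {utterance}"
    return data_store[key]


grammar = {"miami, fla.", "sunny, ga."}
# Placeholder value; the source does not say what data is stored.
data_store = {"miami, fla.": "Information on Miami, Fla."}

reply_known = respond("Miami, Fla.", grammar, data_store)
reply_missing = respond("Sunny, Ga.", grammar, data_store)
reply_oog = respond("Iskabibble", grammar, data_store)
```

In this sketch the exception is caught at the boundary of the audio user interface, so that the data information system is never asked to process speech data which was not recognized.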