1. Field of the Invention
The present invention is related to the field of speech communication, and, more particularly, to speech-based user interfaces.
2. Description of the Related Art
A user interface is the component that allows a user to input into a computer system the various instructions and/or data used by the computer system in carrying out any of a multitude of processing tasks. Various types of user interfaces presently exist, most being implemented through a combination of hardware and software. One of the most promising user interface types, both in terms of future uses and further development, is the speech-based user interface.
The speech-based user interface has dramatically expanded the range of uses to which computers can be put. The speech-based user interface can, in many circumstances, obviate any need for text-based input via a keyboard or graphical input via a graphical user interface (GUI). This enables a user to access a computer system without using a computer terminal, an occurrence common when interfacing with embedded devices, small mobile devices, and telephony-based voice-response systems. Consequently, the speech-based user interface has extended the reach of various types of computer systems by making them usable simply on the basis of ordinary spoken language. Indeed, as a result of the speech-based user interface, there is an expanding array of computer systems for conducting banking transactions, making airline reservations, and carrying out a host of other functions merely on the basis of spoken commands.
Speech-based user interfaces typically rely on automated speech recognition for converting speech into machine-readable instructions and data. In general, speech recognition involves the transformation of acoustic signals into electronic signals that, in turn, are digitized for electronic processing by the speech recognition device. Regardless of the sophistication of the underlying technology, however, there is inevitably some risk inherent in any user-machine dialog via a speech-based user interface that a communication error will occur, just as there is in the case of human-to-human communications.
Accordingly, conventional speech-based user interfaces typically rely on techniques such as help messages and supplemental information provided, for example, by a text-to-speech (TTS) processor to guide a user through an automated, speech-only exchange or dialog with a computer system or speech recognition device. Often, though, users of speech-based interfaces find these help messages and informational guides repetitive and unhelpful. Many speech-based user interfaces operate as though only two types of communication errors occur: either the user is unaware of the need for a user utterance at some point during an interactive dialog, which results in a time-out error, or the user utterance given is unrecognizable to the user interface because it is not part of the user interface grammar.
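By way of illustration only, the two-category error model described above can be sketched in Python as follows. The names used here (DialogError, classify_turn, HELP_PROMPTS, help_prompt_for) and the example grammar are hypothetical and chosen solely for this sketch; they are not taken from any particular speech-based user interface.

```python
from enum import Enum
from typing import Optional, Set


class DialogError(Enum):
    """The only two error types the conventional model distinguishes."""
    TIMEOUT = "timeout"                # no user utterance was received
    OUT_OF_GRAMMAR = "out_of_grammar"  # utterance is not in the grammar


def classify_turn(utterance: Optional[str], grammar: Set[str]) -> Optional[DialogError]:
    """Classify one dialog turn; return None when the utterance is accepted.

    `utterance` is the recognized text, or None if the user said nothing
    before the prompt timed out (an assumption of this sketch).
    """
    if utterance is None or not utterance.strip():
        return DialogError.TIMEOUT
    if utterance.strip().lower() not in grammar:
        return DialogError.OUT_OF_GRAMMAR
    return None


# Hypothetical fixed help messages, one per error type, played back via TTS.
HELP_PROMPTS = {
    DialogError.TIMEOUT: "Please say a command.",
    DialogError.OUT_OF_GRAMMAR: "Sorry, that command was not understood.",
}


def help_prompt_for(error: DialogError) -> str:
    """Return the same generic help message every time the error recurs."""
    return HELP_PROMPTS[error]


if __name__ == "__main__":
    grammar = {"check balance", "transfer funds"}
    for spoken in (None, "check balance", "chick ballots"):
        error = classify_turn(spoken, grammar)
        print(repr(spoken), "->", error.name if error else "accepted")
```

In this sketch, an utterance garbled by background noise and a request for an unsupported function both fall into the same out-of-grammar category and receive the same fixed help message, regardless of what actually caused the error.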
In many contexts, however, communication errors are more complex, and the conventional assumption about the nature of the errors is thus overly simplistic. Conventional speech-based user interfaces thus tend not to address speech recognition communication errors in a manner comparable to the way such errors are recognized and handled in ordinary human-to-human conversations. Accordingly, the processes employed for error recovery with such speech-based user interfaces do not reflect the natural responses that would follow from a human-to-human miscommunication or communication error. This, in turn, can make user-machine dialogs via a speech-based user interface less pleasant and more difficult as compared with human-to-human communications.
Moreover, conventional error recovery processes also tend to lack the complexity needed to deductively determine the nature of a communication error in a dialog via a speech-based user interface. It follows that these processes also typically fail to provide a targeted response to a communication error.