1. Field of the Invention
The present invention generally relates to speech recognition and, more particularly, to a system for replaying a recognized utterance, without actually recording the utterance, so that a user can confirm that the utterance was properly recognized, useful in, for example, hands-free voice telephone dialing applications.
2. Description of the Related Art
Pattern recognition, and particularly speech recognition, is a desirable way to input data and commands into systems which traditionally relied on keyboards, keypads, or other "hands-on" devices. Keyboards, while the current standard for data input, are slow and prone to error, depending on the skill and expertise of the typist. In some modern-day applications, traditional keypads can even prove hazardous. For example, many cars are equipped with cellular or personal telephones which allow the driver to carry on telephone conversations while driving. However, it is usually dangerous and imprudent for the driver to divert his attention from the road to dial a telephone number on the telephone keypad.
Voice dialing is an option on some portable telephones. In operation, the caller speaks the name of the person to be dialed or speaks aloud that person's phone number. Voice recognition circuitry processes the utterance and dials the telephone accordingly, thereby avoiding the need for the user to manually enter the number on the keypad.
Modern voice recognition circuitry does not fare well at correctly recognizing spoken numbers, nor does it fare well at recognizing spoken names, since there are an infinite number of possibilities, many of which sound alike. This problem leads to frequent wrong numbers and unnecessarily increased cellular phone bills, not to mention user frustration.
Further, in modern voice synthesis circuitry, a recorded version of the speech is usually compressed using a speech coding algorithm. The compressed speech is used to provide playback of the word spoken. Due to space considerations, storing whole recorded versions of speech becomes impractical as the size of the vocabulary increases.