The present invention relates to methods and systems for recognizing acoustic utterances, and more particularly, to generating alternate versions of a decoded utterance.
Handheld electronic devices (e.g., mobile phones and PDAs, referred to herein as “handhelds”) typically provide for user input via a keypad or similar interface, through which the user manually enters commands and/or alphanumeric data. Manually entering information may require the user to divert attention from potentially critical activities (e.g., driving). One solution has been to equip the handheld with an embedded speech recognizer. In some cases, the speech recognizer may recognize only a constrained data set, such as a finite list of names or phone numbers; in other cases, it may recognize speech without such constraint, for example in a dictation mode limited only by a set of valid words and rules of grammar.
Due to environmental noise, limitations in the handheld's audio receiver, and, more significantly, limitations in the computing power of the handheld, the speech recognizer may occasionally decode the user's utterance incorrectly. To deal with such errors, some speech recognizers generate a list of N alternatives for the decoded utterance, referred to herein as the choice list (also known in the art as an N-best list), from which the user may choose the correct version.
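By way of illustration only, the notion of a choice list can be sketched as follows. The sketch assumes that a recognizer has already produced a set of scored hypotheses for one utterance; the function name `choice_list`, the example phrases, and the scores are hypothetical and are not taken from any particular recognizer or from the techniques referred to herein.

```python
import heapq
from typing import List, Tuple


def choice_list(hypotheses: List[Tuple[str, float]], n: int) -> List[str]:
    """Return the text of the n highest-scoring hypotheses, best first.

    Each hypothesis is a (text, score) pair; a higher score is assumed
    to mean a more likely decoding (e.g., a log-likelihood).
    """
    best = heapq.nlargest(n, hypotheses, key=lambda h: h[1])
    return [text for text, _score in best]


# Hypothetical scored decodings of a single utterance.
hypotheses = [
    ("call john smith", -12.4),
    ("tall john smith", -18.2),
    ("call joan smith", -13.1),
    ("call john smyth", -14.9),
]

print(choice_list(hypotheses, 3))
# ['call john smith', 'call joan smith', 'call john smyth']
```

In such a sketch, the first entry corresponds to the decoded utterance presented to the user, and the remaining entries are the alternatives from which the user may select a correction.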
Although there are several known techniques for creating a list of alternative utterances, such techniques tend to work best on platforms with significant processing power and relatively large amounts of available memory. When implemented on handheld platforms having limited memory resources and processing power, these techniques become less effective.