In general, speech recognition applications allow users to interact with a system by using their voice. Typical command-and-control (C&C) speech applications allow users to interact with a system by speaking commands and/or asking questions restricted to a fixed grammar containing pre-defined phrases. While speech recognition applications have been commonplace in telephony and accessibility systems for many years, only recently have mobile devices had the memory and processing capacity to support not only speech recognition, but also a whole range of multimedia functionalities that can be controlled by speech.
Furthermore, the ultimate goal of speech recognition technology is to produce a system that can recognize, with 100% accuracy, every word spoken by any person. However, even after years of research in this area, the best speech recognition software applications still cannot recognize speech with 100% accuracy. For example, most commercial speech recognition applications utilize context-free grammars (CFGs) for C&C speech recognition. Typically, these grammars are authored to try to achieve broad coverage of utterances while remaining relatively small for faster performance. As such, some speech recognition applications are able to recognize over 90% of the words when speakers produce utterances that fit within the constraints of the grammars.
Unfortunately, despite attempts to cover all possible utterances for different commands, users occasionally produce expressions that fall outside of the grammars (i.e., out-of-grammar (OOG) user utterances). For example, suppose the grammar is authored to anticipate the expression “What is my battery strength?” for reporting device power. If the user forgets that expression, or simply does not read the instructions, and utters “Please tell me my battery strength,” the speech recognizer will either produce a recognition result with very low confidence or no result at all. This can lead to the speech recognition application failing to complete the task on behalf of the user. Further, if users mistakenly believe and expect that the speech recognition application should recognize the utterance, they may conclude that the speech recognition application is faulty or ineffective, and stop using the product.
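By way of illustration only, the battery-strength example above can be sketched in Python as a toy grammar check. The grammar contents, the rule names (`COMMAND`, `DEVICE_PROPERTY`), and the helper functions are hypothetical and are not drawn from any particular speech recognition product. The sketch also operates on already-transcribed text, whereas an actual recognizer decodes audio against the grammar and reports a confidence score.

```python
import re

# Hypothetical toy grammar. Each nonterminal (uppercase) maps to a list of
# alternative productions; each production is a sequence of terminals
# (lowercase words) and/or nonterminals.
GRAMMAR = {
    "COMMAND": [
        ["what", "is", "my", "DEVICE_PROPERTY"],
        ["what", "is", "the", "DEVICE_PROPERTY"],
    ],
    "DEVICE_PROPERTY": [
        ["battery", "strength"],
        ["signal", "strength"],
    ],
}


def tokenize(utterance):
    """Lowercase the utterance and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", utterance.lower())


def match(symbol, tokens, pos):
    """Yield every token position reachable after matching `symbol` at `pos`."""
    if symbol not in GRAMMAR:
        # Terminal: it must equal the next token exactly.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for production in GRAMMAR[symbol]:
        positions = {pos}
        for part in production:
            positions = {end for p in positions for end in match(part, tokens, p)}
            if not positions:
                break  # this alternative cannot match; try the next one
        yield from positions


def in_grammar(utterance, start="COMMAND"):
    """Return True only if the whole utterance is derivable from `start`."""
    tokens = tokenize(utterance)
    return any(end == len(tokens) for end in match(start, tokens, 0))


if __name__ == "__main__":
    for text in ("What is my battery strength?",
                 "Please tell me my battery strength"):
        label = "in-grammar" if in_grammar(text) else "out-of-grammar (OOG)"
        print(f"{text!r}: {label}")
```

Under these assumptions, the anticipated phrase is accepted, while the paraphrase is rejected even though both express the same request. Covering the paraphrase would require authoring additional productions, which enlarges the grammar.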
In general, due to the above-noted problems, a relatively long development cycle, which typically involves a complex and costly grammar-authoring process, is required to get an application that utilizes CFGs to a relatively high speech recognition accuracy level.
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.