Typical speech recognition applications (e.g., command-and-control (C&C) speech recognition) allow users to interact with a system by speaking commands and/or asking questions that are restricted to a fixed grammar of pre-defined phrases. While speech recognition applications have been commonplace in telephony and accessibility systems for many years, only recently have mobile devices had the memory and processing capacity to support not only speech recognition, but also a whole range of multimedia functionalities that can be controlled by speech.
Furthermore, the ultimate goal of speech recognition technology is to produce a system that can recognize, with 100% accuracy, all of the words spoken by any person. However, even after years of research in this area, the best speech recognition software applications still cannot recognize speech with 100% accuracy. For example, most commercial speech recognition applications utilize context-free grammars for C&C speech recognition. Typically, these grammars are authored such that they achieve broad coverage of utterances while remaining relatively small for faster performance. As such, some speech recognition applications are able to recognize over 90% of the words, but only when the words are spoken under specific constraints regarding content and/or when acoustic training has been performed to recognize the speaker's speech characteristics.
Unfortunately, despite attempts to cover all possible utterances for different commands, users occasionally produce expressions that fall outside of the grammars (that is, out-of-grammar (OOG) user utterances). For example, if a user forgets the expression for battery strength, or simply does not read the instructions, and utters an OOG utterance, the speech recognition application will often either produce a recognition result with very low confidence or produce no result at all. This can lead to the speech recognition application failing to complete the task on behalf of the user. Further, if users mistakenly believe and expect that the speech recognition application should recognize the utterance, they may conclude that the speech recognition application is faulty or ineffective, and stop using the product.
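The OOG failure described above can be illustrated with a minimal sketch. It is not taken from any particular speech recognition product: it assumes the audio has already been transcribed to text, and the grammar contents, the command name `battery_strength`, and the helper names `normalize` and `recognize` are all hypothetical. A real C&C recognizer would score acoustic hypotheses against the grammar and return a confidence value, rather than doing exact text matching.

```python
import re

# Hypothetical C&C grammar: each command maps to the fixed phrasings it accepts.
GRAMMAR = {
    "battery_strength": ["what is the battery strength", "battery strength"],
    "call_contact": ["call home", "call office"],
}


def normalize(utterance):
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", utterance.lower())
    return " ".join(cleaned.split())


def recognize(utterance):
    """Return the matching command name, or None for an out-of-grammar utterance."""
    text = normalize(utterance)
    for command, phrases in GRAMMAR.items():
        if text in phrases:
            return command
    return None  # OOG: the fixed grammar has no rule covering this phrasing


in_grammar = recognize("What is the battery strength?")
out_of_grammar = recognize("How much power do I have left?")
```

Under these assumptions, the first utterance maps to the `battery_strength` command, while the second, although it expresses the same intent, matches no phrase and yields no result, which is the situation that leaves the user's task uncompleted.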