Speech recognition applications have been commonplace in telephony and accessibility systems for many years. However, only recently have mobile devices had the memory and processing capacity to support not only speech recognition but also a whole range of multimedia functionalities that could be controlled by speech.
Furthermore, the ultimate goal of speech recognition (or dialog) technology is to produce a system that can recognize, with 100% accuracy, all words spoken by any person. However, even after years of research in this area, the best speech recognition software applications still cannot recognize speech with 100% accuracy. For example, some applications are able to recognize over 90% of words when the speech is spoken under specific constraints regarding content and when the application has previously been trained to recognize the speaker's speech characteristics, while others recognize a significantly lower percentage. Accordingly, statistical models that can predict commands based in part on past user behavior have been developed to function in combination with speech recognition applications. Used together with the user's speech commands, these models can improve the accuracy and dialog performance of the speech recognition application.
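By way of illustration only, the combination of a recognizer with a behavior-based statistical model might be sketched as follows. The function name `rescore`, the add-one style smoothing, and the multiplicative combination rule are assumptions made for this example; the text above does not specify how the two sources are combined.

```python
from collections import Counter


def rescore(asr_hypotheses, command_history, smoothing=1.0):
    """Reweight recognizer hypotheses by a prior built from past commands.

    asr_hypotheses: dict mapping candidate command -> recognizer confidence.
    command_history: list of commands the user issued previously.
    The multiplicative rule and smoothing are illustrative assumptions.
    """
    counts = Counter(command_history)
    # Smoothed denominator so that unseen candidates keep a nonzero prior.
    total = len(command_history) + smoothing * len(asr_hypotheses)

    scores = {}
    for command, confidence in asr_hypotheses.items():
        prior = (counts[command] + smoothing) / total
        scores[command] = confidence * prior

    norm = sum(scores.values())
    if norm == 0:
        # All recognizer confidences were zero; fall back to a uniform result.
        return {command: 1.0 / len(scores) for command in scores}
    return {command: score / norm for command, score in scores.items()}


# Hypothetical usage: the recognizer slightly prefers "call tom",
# but the user's history strongly favors "call mom".
history = ["call mom"] * 8 + ["call tom"] * 2
combined = rescore({"call mom": 0.45, "call tom": 0.55}, history)
best = max(combined, key=combined.get)
```

In this hypothetical case the history-based prior outweighs the recognizer's small preference, so the combined ranking favors "call mom".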
Unfortunately, the results of the speech commands and of the predictive statistical models can often differ. A discrepancy occurs when the statistical model predicts one goal (or intended result) and the speech command result indicates a different goal. When this situation arises, it may be advantageous for a speech recognition application to engage in a dialog repair process so as to learn which result is more reliable.
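The discrepancy check described above might be sketched, under stated assumptions, as follows. The class `RepairTracker`, the `ask_user` callback, and the simple win counters are hypothetical names introduced for this example and are not taken from any particular speech recognition system.

```python
from dataclasses import dataclass, field


@dataclass
class RepairTracker:
    """Tracks which source the user confirms when the two sources disagree."""

    wins: dict = field(default_factory=lambda: {"speech": 0, "model": 0})

    def resolve(self, speech_goal, model_goal, ask_user):
        """Return the goal to act on, asking the user only on a discrepancy.

        ask_user is a hypothetical callback that receives the list of
        candidate goals and returns the one the user confirms.
        """
        if speech_goal == model_goal:
            # No discrepancy, so no dialog repair is needed.
            return speech_goal

        # Discrepancy: engage in dialog repair by asking the user.
        chosen = ask_user([speech_goal, model_goal])

        # Record which source turned out to be right for later use.
        if chosen == speech_goal:
            self.wins["speech"] += 1
        elif chosen == model_goal:
            self.wins["model"] += 1
        return chosen


# Hypothetical usage: the user confirms the statistical model's goal.
tracker = RepairTracker()
goal = tracker.resolve("call tom", "call mom", lambda options: options[1])
```

The accumulated counts give one simple indication of which result has been more reliable for a given user; a practical system could use a richer measure.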