When voice recognition is used as an intelligent text input method, that is, when speech input is converted to text (for example, in a mobile terminal), an incorrectly recognized word in the resulting text can be corrected in post-processing. Specifically, speech input by a user is recognized, and the recognition result is then examined to detect any word that may have been incorrectly recognized. A correction mode is then provided for the user to correct the detected word.
Currently, to detect an incorrectly recognized word, a confidence measure is used to compute a confidence degree between the input speech and the recognition result, and a word with a low confidence degree is identified as possibly having been recognized incorrectly. To correct such a word, a correction mode is provided in which the user re-inputs the correct word through one of a number of different methods. For example, the user may choose a correct word from a local communication log, may choose a correct word from candidate words with pronunciations similar to that of the incorrectly recognized word, may re-input the speech by re-speaking, may enter a correct word through handwriting recognition, or may directly enter a correct word through a keyboard. The word re-input by the user is then used for correction.
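The confidence-based detection step described above can be sketched as follows. This is only an illustrative sketch: the `RecognizedWord` type, the `flag_low_confidence` function, the assumption that confidence degrees lie in the range 0.0 to 1.0, and the threshold value of 0.5 are all hypothetical and are not taken from any particular speech recognition system.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    """One word of a recognition result (hypothetical structure)."""
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def flag_low_confidence(words, threshold=0.5):
    """Return the positions of words whose confidence degree is below
    the threshold, i.e. words that may have been recognized incorrectly.
    The same threshold is applied to every word."""
    return [i for i, w in enumerate(words) if w.confidence < threshold]


# Illustrative recognition result; the confidence values are made up.
result = [
    RecognizedWord("call", 0.94),
    RecognizedWord("Zhang", 0.31),
    RecognizedWord("tomorrow", 0.88),
]

flagged = flag_low_confidence(result)
for i in flagged:
    print(f"possible misrecognition at position {i}: {result[i].text}")
```

In such a scheme, each flagged position would then be offered to the user for correction through one of the re-input methods listed above.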
However, this conventional type of correction mode has a number of defects.
For example, in the conventional correction mode, the same confidence measure is used to analyze every word input by speech recognition. However, word recognition accuracy for continuous speech in a specific domain can reach 90%, and a word that tends to be incorrectly recognized is typically an Out-Of-Vocabulary (OOV) word, that is, a word not included in the vocabulary of the speech recognition system. Applying the same measure uniformly to every word therefore does not concentrate on the words most likely to be wrong.
When a short message, an email, a query entry, or the like is input by voice, nouns usually carry the main idea of the input content. Among all the nouns, named entity vocabulary, which mainly includes person names, place names, and organization names, accounts for a large proportion. Because the named entity vocabulary is a constantly growing and continuously updated vocabulary set, many named entity words are OOV words to a speech recognition system and therefore tend to be recognized incorrectly. The conventional methods do not focus on locating such named entity vocabulary, which often carries the main idea of the input content and tends to be incorrectly recognized.
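The OOV condition discussed above can be illustrated with a short sketch. It assumes a closed-vocabulary recognizer, which can only output words from its own vocabulary, so any intended word outside that vocabulary cannot be recognized correctly. The function name `find_oov_words`, the sample vocabulary, and the sample utterance are hypothetical.

```python
def find_oov_words(intended_words, vocabulary):
    """Return the intended words that are missing from the recognizer's
    vocabulary (compared case-insensitively), in their original order."""
    vocab = {w.lower() for w in vocabulary}
    return [w for w in intended_words if w.lower() not in vocab]


# Hypothetical recognizer vocabulary containing no named entities.
vocabulary = {"meet", "call", "at", "tomorrow"}

# Words the user intended to say; the person name and the place name
# are named entities that the vocabulary does not cover.
intended = ["meet", "Zhang", "at", "Huaqiangbei", "tomorrow"]

oov = find_oov_words(intended, vocabulary)
print(oov)
```

Here the two named entities are the only OOV words, which matches the observation that named entity vocabulary is a frequent source of recognition errors.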
Further, when correcting an incorrectly recognized word, the conventional methods focus only on having the user re-input a correct word, and do not consider richer or more convenient ways of correction.