The invention relates to speech recognition technology and, more particularly, to a method for correcting error characters in results of speech recognition and a speech recognition system using the same.
Speech recognition technology uses computers and digital signal processing to recognize human speech (for example, characters, words, sub-sentences, sentences, etc.). Speech recognition is based on collecting the relevant characteristics of the speech to be recognized, forming an acoustic model from them, comparing that model with sample models stored in the computer, and identifying the characters and words through pattern classification methods. The recognition process operates on units of language composition such as syllables or words. Speech recognition is undoubtedly a fast and effective way to input text into a computer. Although a great deal of research on speech recognition has been performed, recognition of continuous speech, speaker-independent recognition, and recognition of long words are still at an exploratory stage because of the complexity of language. Because the accuracy of speech recognition can never reach 100%, error correction for the results of speech recognition is an indispensable step.
The friendliness and efficiency of the alternative input methods used in the error correction process are very important, since error correction is part of the complete speech input process and can be a deciding factor in the user's acceptance of speech input. Generally, different input methods such as handwriting input or various types of stroke-based input have been used to correct error characters in speech recognition results, because users of speech recognition systems often do not want to use a keyboard or are not familiar with it. These users prefer stroke-based handwriting input methods, such as handwriting input, stroke-based input or stroke type-based input, which are close to natural handwriting habits. However, handwriting recognition technology is not yet mature, which reduces the efficiency of error correction for speech recognition results. The error correction methods used so far for the results of speech recognition do not take advantage of the useful acoustic information generated during the speech recognition process.
An object of the invention is to make effective use of the useful acoustic information generated during the speech recognition process, so as to improve the error correction efficiency of speech recognition, that is, to improve the reliability and speed of the error correction.
The invention fully exploits the useful acoustic information obtained in the speech recognition process to maximize the error correction efficiency for the results associated with speech recognition by using the alternative stroke-based input methods. The invention automatically retains and processes the valuable acoustic information from the speech recognition process. This is accomplished via internal data transfer and incorporation of an evaluation procedure involving several statistical models. The invention uses a confusion matrix to generate an acoustic model, and the acoustic model cooperates with character level and word level language models to optimize the error correction processing.
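As an illustration of how a confusion matrix might serve as an acoustic model, the following minimal sketch estimates the probability that an intended syllable is recognized as another syllable. The text above does not specify how the matrix is estimated; the function name `build_confusion_model`, the use of (intended, recognized) training pairs, the additive smoothing constant, and the example syllables are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_confusion_model(pairs, smoothing=1e-3):
    """Estimate P(recognized | intended) from (intended, recognized) pairs.

    Hypothetical helper: additive smoothing is an assumption, not taken
    from the source text.
    """
    # Count how often each intended unit was recognized as each unit.
    counts = defaultdict(lambda: defaultdict(float))
    for intended, recognized in pairs:
        counts[intended][recognized] += 1.0

    vocab = sorted({unit for pair in pairs for unit in pair})
    model = {}
    for intended in vocab:
        row = counts[intended]
        total = sum(row.values()) + smoothing * len(vocab)
        # Normalize each row so that it sums to 1.
        model[intended] = {
            recognized: (row.get(recognized, 0.0) + smoothing) / total
            for recognized in vocab
        }
    return model


# Hypothetical training data: "shi" is sometimes misrecognized as "si".
example_model = build_confusion_model(
    [("shi", "shi"), ("shi", "si"), ("si", "si")]
)
```

Each row of the resulting table can then be read as an acoustic likelihood: given the syllable the recognizer output, it indicates how plausible each intended syllable is.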
According to an aspect of the invention, a method for correcting one or more error characters in results of speech recognition comprises the steps of:
marking the one or more error characters in the speech recognition results;
inputting one or more correct characters corresponding to the one or more marked error characters by input based on character-shape;
recognizing the input based on character-shape;
displaying one or more candidate characters;
selecting one or more desired characters from the one or more candidate characters in accordance with the user; and
replacing the one or more error characters with the one or more selected characters;
the method characterized by further comprising the step of filtering the one or more candidate characters in accordance with acoustic information associated with the one or more error characters.
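The filtering step of the method above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function `filter_candidates`, the pronunciation lexicon, the multiplication of shape score by acoustic probability, and the placeholder character identifiers are all hypothetical.

```python
def filter_candidates(candidates, error_char, pronunciation, confusion, top_n=5):
    """Filter and re-rank character-shape candidates by acoustic plausibility.

    candidates:    list of (character, shape_score) from shape recognition
    error_char:    the marked error character
    pronunciation: dict mapping character -> syllable (hypothetical lexicon)
    confusion:     dict mapping intended syllable -> {recognized syllable: prob}
    """
    heard = pronunciation[error_char]  # syllable the recognizer output
    scored = []
    for char, shape_score in candidates:
        syllable = pronunciation.get(char)
        # Probability that this candidate's syllable was misheard as `heard`.
        acoustic = confusion.get(syllable, {}).get(heard, 0.0)
        if acoustic > 0.0:  # drop acoustically implausible candidates
            scored.append((char, shape_score * acoustic))
    scored.sort(key=lambda item: item[1], reverse=True)
    return [char for char, _ in scored[:top_n]]


# Hypothetical data: "err" is the marked error character, heard as "si".
pronunciation = {"err": "si", "c1": "shi", "c2": "zu", "c3": "xing"}
confusion = {
    "shi": {"shi": 0.7, "si": 0.3},
    "zu": {"zu": 1.0},
    "xing": {"xing": 0.9, "xin": 0.1},
}
candidates = [("c2", 0.5), ("c1", 0.4), ("c3", 0.3)]
kept = filter_candidates(candidates, "err", pronunciation, confusion)
```

In this example only `c1` survives, because its syllable is the only one that the assumed confusion table allows to be misheard as the syllable of the error character, even though `c2` has the higher shape score.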
According to another aspect of the invention, a speech recognition system capable of correcting one or more error characters in results of speech recognition comprises:
voice detection means for collecting a speech sample of a user;
pronunciation probability calculation means, which, for each pronunciation in an acoustic model, gives a probability estimation value of whether the pronunciation is the same as the speech sample;
word probability calculation means, which, according to a language model, gives a probability estimation value of a word occurring in a current context;
word matching means for calculating a joint probability through combining a probability value calculated by the pronunciation probability calculation means with a probability value calculated by the word probability calculation means and taking the word with the greatest joint probability value as the result of the speech recognition;
context generating means for modifying the current context by using the speech recognition result; and,
word output means;
the speech recognition system characterized by further comprising error correction means, wherein a user marks the one or more error characters in the results of the speech recognition via the error correction means and inputs one or more correct characters corresponding to the one or more error characters by input based on character-shape, and the error correction means recognizes the input, generates one or more candidate characters and filters the one or more candidate characters via acoustic information associated with the one or more error characters.
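The word matching means described above combines a pronunciation probability with a language model probability and selects the word with the greatest joint probability. A minimal sketch of that selection follows, assuming the two probabilities are combined by multiplication (done here in log space for numerical stability); the function name `match_word` and the example words and values are hypothetical.

```python
import math


def match_word(acoustic_probs, language_probs):
    """Return the word with the greatest joint probability, or None.

    acoustic_probs: dict word -> probability from the acoustic model
    language_probs: dict word -> probability of the word in the current context
    Assumption: joint probability is the product of the two values.
    """
    best_word, best_score = None, -math.inf
    for word, p_acoustic in acoustic_probs.items():
        p_language = language_probs.get(word, 0.0)
        if p_acoustic <= 0.0 or p_language <= 0.0:
            continue  # a zero probability rules the word out
        # Sum of logs is equivalent to the log of the product.
        score = math.log(p_acoustic) + math.log(p_language)
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical values for three acoustically similar words.
result = match_word(
    {"write": 0.40, "right": 0.45, "rite": 0.15},
    {"write": 0.50, "right": 0.30, "rite": 0.01},
)
```

Here "right" has the highest acoustic probability on its own, but the language model shifts the joint decision to "write".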