The invention relates to a correction device for correcting a text recognized by a speech recognition device for a spoken text, where the recognized text for spoken words of the spoken text includes correctly recognized words and incorrectly recognized words.
The invention further relates to a correction method for correcting a text recognized by a speech recognition device for a spoken text, where the recognized text for spoken words of the spoken text includes correctly recognized words and incorrectly recognized words.
Such a correction device and such a correction method are known from the document U.S. Pat. No. 5,909,667, in which a dictation device is disclosed. The known dictation device is formed by a computer which operates speech recognition software and word-processing software. A user of the known dictation device can speak a text into a microphone connected to the computer. The speech recognition software forming a speech recognition facility assigns a known word to each spoken word of the spoken text, by which means a recognized text is obtained for the spoken text. The recognized text contains so-called correctly recognized words, which match the words that the user actually spoke, and so-called incorrectly recognized words, which do not match the words that the user actually spoke. The recognized text is presented on a screen connected to the computer, by the word-processing software forming a word-processing facility.
The known dictation device also forms a correction device, which contains both the word-processing software and the speech recognition software, and with which incorrectly recognized words can be replaced with correction words. For this purpose, the user marks the incorrectly recognized word, inputs the correction word or words with a keyboard of the computer, and then enters a confirmation, causing the marked incorrectly recognized word to be replaced by the input correction word.
To simplify the marking of the incorrectly recognized word to be replaced, the user of the known dictation device can speak the incorrectly recognized word to be replaced, a so-called marker word, into the microphone once again. The speech recognition software thereupon recognizes a recognized marker word for this spoken marker word, and the word-processing software searches for the recognized marker word in the words of the recognized text. If the recognized marker word is found through a comparison of letter sequences of words in the recognized text, the word-processing device will mark this marker word. After speaking the marker word, the user must check whether the word to be replaced was actually marked. If this is the case, the user inputs the correction word and a confirmation using the keyboard, to implement the replacement.
With the known dictation device, the disadvantage has emerged that it is precisely those incorrectly recognized words contained in the recognized text which are difficult for the speech recognition software to recognize, so that a high error rate also occurs in the recognition of marker words. As a result, other words of the recognized text information rather than the incorrectly recognized words are relatively often marked for replacement, which means additional work. A further disadvantage of the known dictation device has emerged in that the user must execute relatively many different actions (microphone and keyboard) in order to replace an incorrectly recognized word.
It is an object of the invention to create a correction device as specified in the first paragraph and a correction method as specified in the second paragraph in which the aforementioned disadvantages are avoided.
To achieve the above object, inventive features are provided in such a correction device so that the correction device can be characterized in the following way. A correction device for correcting a text recognized by a speech recognition device for a spoken text, where the recognized text for spoken words of the spoken text includes correctly recognized words and incorrectly recognized words, with
input means for receiving at least one manually input correction word in order to replace at least one of the incorrectly recognized words with at least one correction word, and with
transcription means for phonetic transcribing of at least the input correction word into a phoneme sequence, and with
search means for finding the phoneme sequence of the at least one correction word in phoneme sequences of the words of the recognized text and for issuing position information which identifies the position of at least one word within the recognized text whose phoneme sequence essentially matches the phoneme sequence of the at least one correction word, and with
output means for issuing the position information so as to enable a marking of the at least one word identified by the position information in the recognized text information.
To achieve the above object, inventive features are provided in such a correction method so that the correction method can be characterized in the following way.
A correction method for correcting a text recognized by a speech recognition device for a spoken text, where the recognized text for spoken words of the spoken text includes correctly recognized words and incorrectly recognized words, the following steps being processed:
receiving at least one manually entered correction word so as to replace at least one of the incorrectly recognized words with at least one correction word;
phonetically transcribing at least the input correction word into a phoneme sequence;
searching for the phoneme sequence of the at least one correction word in phoneme sequences of the words of the recognized text and issuing position information which identifies the position of at least one word within the recognized text whose phoneme sequence essentially matches the phoneme sequence of the at least one correction word;
issuing the position information so as to enable a marking of the at least one word identified by the position information in the recognized text information.
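The steps of the correction method can be illustrated by the following minimal sketch. It is not the claimed implementation: the toy lexicon `LEXICON`, the functions `transcribe` and `find_positions`, the ARPAbet-like phoneme symbols, and the similarity threshold of 0.8 used for "essentially matches" are all assumptions made for illustration. A lookup table stands in for the transcription means.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon: word -> phoneme sequence (ARPAbet-like symbols).
# It stands in for the transcription means and is only an assumption here.
LEXICON = {
    "i": ["AY"],
    "waited": ["W", "EY", "T", "IH", "D"],
    "four": ["F", "AO", "R"],
    "for": ["F", "AO", "R"],
    "the": ["DH", "AH"],
    "bus": ["B", "AH", "S"],
}


def transcribe(word):
    """Phonetically transcribe one word into a phoneme sequence."""
    return LEXICON[word.lower()]


def find_positions(recognized_words, correction_word, threshold=0.8):
    """Return the positions of words whose phoneme sequence essentially
    matches the phoneme sequence of the correction word."""
    target = transcribe(correction_word)
    positions = []
    for index, word in enumerate(recognized_words):
        ratio = SequenceMatcher(None, transcribe(word), target).ratio()
        if ratio >= threshold:
            positions.append(index)
    return positions


recognized = ["I", "waited", "four", "the", "bus"]
# The manually entered correction word "for" identifies position 2 ("four").
print(find_positions(recognized, "for"))  # [2]
```

The returned list plays the role of the position information: it identifies which word of the recognized text would be marked for replacement.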
The invention is based on the recognition that the words incorrectly recognized by a speech recognition device and the words that should actually have been recognized (i.e. the words to be recognized correctly) very often sound very similar. For such similarly sounding words in particular, for example “four” and “for”, the error rate of known speech recognition devices is often especially high.
As a result of the features according to the invention, the user does not need to mark an incorrectly recognized word that he wants to replace with a correction word that should actually have been recognized. The correction device determines the phoneme sequence of the input correction word by statistical methods, which phoneme sequence represents the sound of the correction word. By comparing the phoneme sequences, the correction device then searches for a word that sounds similar to the correction word in the recognized text.
Advantageously, the incorrectly recognized word very probably to be replaced in the recognized text information is thus automatically marked by the input of the correction word. The user can effect the replacement of the marked word by inputting a confirmation, or cause marking of a further similar sounding word of the recognized text information by inputting a next information.
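The interaction described above, in which a confirmation replaces the marked word and a next information moves the marking to a further similar sounding word, could be sketched as follows. The function name `apply_correction` and the input strings `"confirm"` and `"next"` are hypothetical; the list of candidate positions is assumed to come from a preceding phoneme search.

```python
def apply_correction(words, candidates, correction_word, user_inputs):
    """Mark each candidate position in turn and read one user input for it.

    "confirm" replaces the marked word with the correction word; any other
    input (for example "next") moves the marking to the next candidate.
    If no candidate is confirmed, the text is returned unchanged.
    """
    inputs = iter(user_inputs)
    for position in candidates:
        if next(inputs, None) == "confirm":
            corrected = list(words)
            corrected[position] = correction_word
            return corrected
    return list(words)


recognized = ["four", "people", "waited", "four", "hours"]
# Both occurrences of "four" sound like "for"; the user skips the first
# marked word and confirms the second.
print(apply_correction(recognized, [0, 3], "for", ["next", "confirm"]))
# ['four', 'people', 'waited', 'for', 'hours']
```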
Known correction devices of speech recognition devices enable a synchronous reproduction of the spoken words and the associated recognized words of the recognized text for the correction of incorrectly recognized words. When the user of these known correction devices notices an incorrectly recognized word, he interrupts the synchronous reproduction and executes the replacement of the incorrectly recognized word with a word put in by the user. The user then activates the synchronous reproduction again in order to find and correct further incorrectly recognized words in the recognized text.
According to the measures of claim 2 and claim 9, the advantage is gained that the synchronous reproduction is automatically interrupted as soon as the user begins to input a correction word.
According to the measures of claim 3 and claim 10, the advantage is gained that the interruption of the synchronous reproduction is automatically terminated again as soon as the user confirms by input of the confirmation that the automatically marked word should be replaced with the input correction word.
In the synchronous reproduction, the user of a correction device recognizes an incorrectly recognized word in the environment of the word, which is currently being acoustically reproduced and optically marked during the synchronous reproduction. According to the measures of claim 4 and claim 11, the advantage is gained that the search means initially look in the immediate vicinity of the word marked in the recognized text at the time of the interruption for a similar sounding word, and initially mark this. If the user should initiate a further search by entering the next information, then the search area is widened.
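One possible way to order the search around the word marked at the time of the interruption is sketched below. The function name `candidates_by_widening`, the initial radius, and the doubling of the search area are assumptions for illustration; the text above only states that the vicinity is searched first and that the search area is widened on a further search.

```python
def candidates_by_widening(match_positions, cursor, text_length, initial_radius=2):
    """Yield matching word positions, nearest to the cursor first.

    Only matches within the current radius around the cursor are offered;
    once these are exhausted, the radius is doubled until it covers the
    whole recognized text.
    """
    seen = set()
    radius = max(1, initial_radius)  # guard against a radius that never grows
    while True:
        for pos in sorted(match_positions, key=lambda p: abs(p - cursor)):
            if abs(pos - cursor) <= radius and pos not in seen:
                seen.add(pos)
                yield pos
        if radius >= text_length:
            return
        radius *= 2


# Matches at positions 1, 10 and 4; playback was interrupted at word 5.
print(list(candidates_by_widening([1, 10, 4], cursor=5, text_length=12)))
# [4, 1, 10]
```

Because the function is a generator, each request for a further candidate (the next information) only advances the search as far as needed.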
In a speech recognition procedure, the speech recognition device first determines a phoneme sequence associated with the spoken text, and based on this phoneme sequence recognizes the recognized text. According to the measures of claim 5 and claim 12, the advantage is gained that in their search for the phoneme sequence of the correction word, the search means use the phoneme sequence already determined by the speech recognition device. This is especially advantageous if the correction device forms part of the speech recognition device.
To increase the reliability of the search means, it has proved advantageous that phonemes that sound very similar are rated as identical phonemes in the search. Thus, for example, in phoneme sequences of English words, the phonemes “v” and “f”, and “t” and “d”, are taken to be identical in the search by the search means.
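Rating similar sounding phonemes as identical can be sketched by mapping each phoneme to a representative before comparison. The table `EQUIVALENT`, the function names, and the ARPAbet-like symbols are assumptions for illustration; only the pairs v/f and t/d are taken from the text above.

```python
# Map each phoneme to the representative of its similarity class.
# Only the pairs named in the text are included: v/f and t/d.
EQUIVALENT = {"V": "F", "D": "T"}


def normalize(phonemes):
    """Replace every phoneme by the representative of its class."""
    return [EQUIVALENT.get(p, p) for p in phonemes]


def sounds_alike(first, second):
    """True if two phoneme sequences are identical after normalization."""
    return normalize(first) == normalize(second)


leaf = ["L", "IY", "F"]
leave = ["L", "IY", "V"]
bat = ["B", "AE", "T"]
bad = ["B", "AE", "D"]

print(sounds_alike(leaf, leave))  # True
print(sounds_alike(bat, bad))     # True
print(sounds_alike(leaf, bat))    # False
```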