The present invention relates to a document creation system such as a word processor or a word processor program executed on a computer, and more particularly to a voice-input document creation system which automatically recognizes voice input data and allows a user to create and edit documents.
Conventionally, this type of voice-input document creation system (hereafter called a “voice-input word processor”) has been used by computer beginners to enter kana-kanji mixed sentences into a computer straightforwardly without using a keyboard. An earlier patent disclosure dealing with this is found in Japanese Patent Kokai Publication JP-A-3-148750. The voice-input word processor disclosed in that publication allows a user, while drafting sentences, to correct a wrongly recognized input via voice input. This voice-input word processor has a voice recognition module which recognizes a plurality of output candidates for a clause-basis or word-basis voice input. It has a first memory storing a feature content of each entered voice input, a second memory storing a plurality of candidates recognized by the voice recognition module for each voice input, and a correction determination module which compares the feature content of the newest voice input with the feature content saved in the first memory to determine whether the newest voice input is a correction of the immediately preceding voice input. When the correction determination module determines that the newest voice input is a correction of the immediately preceding voice input, the word processor displays the next candidate saved in the second memory as the recognized result of the newest voice input.
However, this conventional voice-input word processor has the following problems.
The first problem is that, in the worst case, the user must make as many corrections as the number of output candidates minus one.
The reason is that, for the newest voice input, only one candidate is displayed at a time, in descending order of priority beginning with the top-priority candidate.
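The worst case described above can be illustrated with a minimal sketch. It assumes the conventional behavior is modeled as stepping through a priority-ordered candidate list, one candidate per re-utterance; the function name and the sample words are hypothetical and do not come from the cited publication.

```python
def corrections_needed_one_at_a_time(candidates, desired):
    """Count the corrections needed when one candidate is shown per utterance.

    `candidates` is ordered from highest to lowest priority. The first
    utterance shows candidates[0]; each correction shows the next one.
    Returns None when the desired item is not among the candidates.
    """
    for corrections, shown in enumerate(candidates):
        if shown == desired:
            return corrections
    return None


# Hypothetical candidate list for a single utterance.
candidates = ["write", "right", "rite", "wright"]
worst_case = corrections_needed_one_at_a_time(candidates, "wright")
```

With four candidates, reaching the lowest-priority one takes three corrections, that is, the number of candidates minus one. A desired word that is absent from the list is never reached, which corresponds to the second problem.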
The second problem is that, in a case where a voice input is wrongly recognized and no output candidate is the desired clause or word, the user cannot get the desired clause or word even after the corrections noted above.
The reason is that, depending upon how the last input was recognized, the output candidates do not include the desired clause or word. In this case, the conventional word processor requires the user to delete the recognized result of the last input and then to re-enter the voice input, resulting in cumbersome operations and decreased efficiency.
The present invention seeks to solve the problems associated with the prior art described above. It is an objective of the present invention to provide a voice-input document creation system which requires the user to enter a correction to a wrongly recognized result at most about twice, thus eliminating the need to enter a correction many times and ensuring input efficiency, operability, and ease of use.
According to a first aspect of the present invention, there is provided a voice input document creation system, particularly a voice-input word processor, which has a speech recognition module for recognizing a plurality of candidates for a clause-basis or a word-basis voice input, wherein when the system receives a voice input equivalent to an immediately preceding voice input, the system assumes that the user has made a correction to the immediately preceding voice input and displays on the screen all the output candidates for the immediately preceding voice input. When the user consecutively makes another correction, the system assumes that there was no desired clause or word in the output candidates and displays a list of output candidates for the newest input.
According to a second aspect, a voice input document creation system comprises: a speech recognition module for recognizing a plurality of output candidates in response to a voice input, means for comparing a feature content of a newest voice input with a feature content of an immediately preceding voice input to determine if the newest voice input is a correction to the immediately preceding voice input, wherein, upon a first-time correction, a list of all output candidates for the immediately preceding voice input is displayed, and wherein, upon a second-time correction, a list of output candidates for the newest voice input is displayed, the list of output candidates excluding output candidates displayed upon the first-time correction.
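The two-stage behavior of the second aspect can be sketched as a single selection function. This is an illustrative sketch only: the function name, the integer `correction_count` argument, and the sample words are assumptions, and candidate lists are assumed to be plain lists of strings ordered by priority.

```python
def candidates_to_display(correction_count, previous_candidates, newest_candidates):
    """Pick the list to show for the first or second consecutive correction.

    correction_count == 1: show every candidate of the preceding input.
    correction_count == 2: show the newest input's candidates, leaving out
    anything that was already shown on the first correction.
    """
    if correction_count == 1:
        return list(previous_candidates)
    if correction_count == 2:
        already_shown = set(previous_candidates)
        return [c for c in newest_candidates if c not in already_shown]
    raise ValueError("only first-time and second-time corrections are modeled")


# Hypothetical recognition results for the same spoken word.
previous = ["write", "right"]
newest = ["right", "rite", "wright"]
first_list = candidates_to_display(1, previous, newest)
second_list = candidates_to_display(2, previous, newest)
```

Here `first_list` holds both candidates of the preceding input, and `second_list` holds only the candidates the user has not yet seen, so the priority order of the newest recognition is preserved.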
In a third aspect, a voice input document creation system comprises: means for comparing a feature content of a newest voice input with a feature content of an immediately preceding voice input to determine if the newest voice input is a correction to the immediately preceding voice input; and means for displaying a list of all output candidates for the immediately preceding voice input when the newest voice input is determined to be a correction to the immediately preceding voice input, on the assumption that the input was retried, and, when the same voice input is entered again, for displaying the list of output candidates for the newest voice input on the assumption that the previously displayed output candidates do not include a desired clause or word.
In a fourth aspect, a voice input document creation system comprises: an input device receiving voices; a feature extracting device extracting a feature content of a voice input received via the input device; a first memory in which the feature content of the voice input is saved; a second memory in which at least one output candidate for a newest voice input is saved; a determination module for comparing the feature content of the newest voice input with the feature content, saved in the first memory, of the voice input immediately preceding the newest, and for determining if the newest voice input is a correction to the voice input immediately preceding the newest; a comparison module comparing the feature content of the newest voice input with a feature content of each clause or word stored in a recognition dictionary to select at least one output candidate; a third memory in which the at least one output candidate for a second-time correction is saved; and a display. In this system, when the determination module determines that the newest voice input is a correction to the voice input immediately preceding the newest and a list of output candidates is not yet displayed, all recognized results of the voice input immediately preceding the newest, which are saved in the second memory, are displayed on the display as a list of output candidates. When the determination module determines that the newest voice input is a correction and the list of output candidates is already displayed, a determination is made that the displayed list of output candidates does not include a desired clause or word, and the list of output candidates is replaced by a list of output candidates for the newest voice input.
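The interplay of the three memories and the determination module can be sketched as a small state holder. This is a sketch under stated assumptions rather than the disclosed implementation: the class name `VoiceInputSession`, the `recognize` and `is_correction` callables (stand-ins for the comparison module and the determination module), and the `text` list standing in for the sentence input location are all hypothetical. The sketch also filters out already-shown candidates on the second correction, which is one possible reading of how the replacement list is built.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class VoiceInputSession:
    # Hypothetical stand-ins for the comparison and determination modules.
    recognize: Callable[[Sequence[float]], List[str]]
    is_correction: Callable[[Sequence[float], Sequence[float]], bool]
    first_memory: Optional[Sequence[float]] = None          # features of the preceding input
    second_memory: List[str] = field(default_factory=list)  # candidates of the preceding input
    third_memory: List[str] = field(default_factory=list)   # candidates for a second-time correction
    list_displayed: bool = False
    text: List[str] = field(default_factory=list)           # stands in for the document

    def handle(self, features: Sequence[float]) -> Optional[List[str]]:
        """Return the candidate list to display, or None for an ordinary input."""
        corrected = self.first_memory is not None and self.is_correction(
            features, self.first_memory
        )
        if not corrected:
            # Ordinary input: remember it and enter the top-priority candidate.
            self.first_memory = features
            self.second_memory = self.recognize(features)
            self.third_memory = []
            self.list_displayed = False
            if self.second_memory:
                self.text.append(self.second_memory[0])
            return None
        if not self.list_displayed:
            # First-time correction: show everything recognized for the preceding input.
            self.list_displayed = True
            return list(self.second_memory)
        # Second-time correction: replace the list with the newest input's candidates.
        shown = set(self.second_memory)
        self.third_memory = [c for c in self.recognize(features) if c not in shown]
        return list(self.third_memory)


# Hypothetical recognition results keyed by feature tuples.
results = {(1.0, 0.0): ["write", "right"], (1.0, 0.1): ["right", "rite", "wright"]}
session = VoiceInputSession(
    recognize=lambda f: results[tuple(f)],
    is_correction=lambda new, old: abs(new[0] - old[0]) < 0.5,
)
outcome_first_input = session.handle((1.0, 0.0))
outcome_first_correction = session.handle((1.0, 0.1))
outcome_second_correction = session.handle((1.0, 0.1))
```

In this usage, the first call enters the top-priority candidate into `text`, the second call returns the full list saved in the second memory, and the third call returns the newly recognized candidates that were not shown before. How a selection from the displayed list is written back into the document is left out of the sketch.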
In a fifth aspect, upon a first-time correction, a list of all output candidates for the voice input immediately before the newest is displayed, the output candidates being saved in the second memory, and, upon a second-time correction, output candidates for the newest voice input are saved in the third memory and a list thereof is displayed, the output candidates displayed upon the first-time correction being neither included in the list nor saved in the third memory.
In a sixth aspect, when the determination module determines that the voice input is not a correction, the feature content of the newest voice input is saved in the first memory for use when the next voice input is entered and, at the same time, all recognized results of the newest voice input are saved in the second memory, and a top-priority output candidate is displayed at a sentence input location on the display.
In a seventh aspect, when the determination module determines that the voice input is a correction and the list of output candidates is already displayed, a determination is made that the list of output candidates does not include a desired clause or word and, when the list of output candidates is replaced by the list of candidates for the newest voice input, the recognized results already saved as the recognized results of the original voice input are not displayed.
In an eighth aspect, there is provided a recording medium having stored therein a program which causes a computer to execute a voice input document creation system having a speech recognition module for recognizing a plurality of output candidates in response to a voice input entered on a clause or word basis. The program comprises:
(a) comparing a feature content of a newest voice input with a feature content of a voice input immediately before the newest to determine if the newest voice input is a correction to the voice input immediately before the newest; and
(b) displaying a list of all output candidates for the voice input immediately before the newest when the newest voice input is determined to be a correction to the voice input immediately before the newest, on the assumption that the input was retried; and, when the same voice input is subsequently entered again, displaying the list of output candidates for the newest voice input on the assumption that the previously displayed output candidates did not include a desired clause or word.
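Step (a) leaves open how two feature contents are compared. The sketch below shows one possible approach, purely as an assumption: feature contents are modeled as equal-length numeric vectors, similarity is measured by cosine similarity, and the 0.95 threshold is an arbitrary illustrative value. None of these choices is stated in the text above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_correction(newest, previous, threshold=0.95):
    """Treat the newest input as a retry when its features nearly match the previous ones.

    The threshold is an assumed tuning parameter, not a value from the source.
    """
    if len(newest) != len(previous):
        return False
    return cosine_similarity(newest, previous) >= threshold


# Hypothetical feature vectors: a repeated utterance and an unrelated one.
repeated = is_correction([1.0, 2.0, 3.0], [1.0, 2.1, 2.9])
unrelated = is_correction([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

A real system would more likely compare time-aligned acoustic feature sequences; the fixed-length vector comparison here only illustrates where the decision of step (a) feeds into step (b).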