1. Field of the Invention
The present invention relates to an input information processing method and apparatus, serving as a front end of an image processing apparatus, for recognizing vocal information, replacing the information with an action in the form of a command sequence, and executing the action.
2. Description of the Related Art
Of the media used for transmitting information between human beings, speech is the most commonly and naturally used. Meanwhile, with the remarkable advances in computers, computers have come to handle not only numerical calculation but also various other kinds of information. Thus, there is a demand to use various media, in particular speech, rather than code information, as a medium for information transmission between a human being and a computer.
In response to such a demand, a voice recognition information processing apparatus for performing operations on the basis of information input in the form of speech has appeared. In a voice recognition section of such an apparatus, it is common practice that words or sentences to be recognized are prestored, the similarity between the input speech and each of the prestored words or sentences is calculated, and the word or sentence having the highest similarity is regarded as the recognized result of the input speech.
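The highest-similarity selection described above can be sketched as follows. This is only an illustration under stated assumptions: text strings stand in for speech, `difflib.SequenceMatcher` stands in for an acoustic similarity measure, and the names `VOCABULARY`, `similarity`, and `recognize` are hypothetical rather than taken from any actual apparatus.

```python
from difflib import SequenceMatcher

# Hypothetical prestored vocabulary, using commands quoted in this description.
VOCABULARY = ["switch on", "switch off", "terminate", "to the right"]

def similarity(input_speech, entry):
    """Stand-in similarity in [0, 1]; a real recognizer would compare acoustic features."""
    return SequenceMatcher(None, input_speech, entry).ratio()

def recognize(input_speech, vocabulary=VOCABULARY):
    """Return the prestored entry having the highest similarity to the input."""
    return max(vocabulary, key=lambda entry: similarity(input_speech, entry))

# An ambiguous input is always resolved to the single best-scoring entry.
first = recognize("switch of")
second = recognize("switch of")
```

Because the selection is deterministic, an identical input always yields the same result, whether or not that result is what the user intended.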
However, the voice recognition section of the above-described conventional voice recognition information processing apparatus has a problem in recognition performance: if a vocal sound which is input once is incorrectly recognized as a different word or sentence, the same incorrect recognition is repeated even if the vocal sound to be recognized is input again.
Meanwhile, the words or sentences entered beforehand for voice recognition differ in the situations in which they can be input. Control commands for two-state switching, such as "switch on" and "switch off", are commands of which only one can be input in any given state. Commands such as a "terminate" command cannot be input more than once in succession because a confirmation is required for the operation instructed. Conversely, commands of relative instruction, such as "to the right" or "increases", can be input more than once in succession. The conventional voice recognition information processing apparatus handles these commands without distinguishing the input situations, which differ from command to command.
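The three input situations described above can be modeled as in the following minimal sketch. The conventional apparatus makes no such distinction; the enumeration `InputSituation`, the table `COMMANDS`, and the function `can_be_input` are hypothetical names introduced purely for illustration.

```python
from enum import Enum

class InputSituation(Enum):
    TWO_STATE = "two_state"    # only one of the pair is valid in any state
    NO_REPEAT = "no_repeat"    # cannot be input twice in succession
    REPEATABLE = "repeatable"  # relative instruction, may be repeated

# Hypothetical classification of the commands quoted in this description.
COMMANDS = {
    "switch on": InputSituation.TWO_STATE,
    "switch off": InputSituation.TWO_STATE,
    "terminate": InputSituation.NO_REPEAT,
    "to the right": InputSituation.REPEATABLE,
    "increases": InputSituation.REPEATABLE,
}

def can_be_input(command, previous_command, switch_is_on):
    """Return True if `command` is a plausible input in the current situation."""
    situation = COMMANDS[command]
    if situation is InputSituation.TWO_STATE:
        # "switch on" is valid only while off; "switch off" only while on.
        return (command == "switch on") != switch_is_on
    if situation is InputSituation.NO_REPEAT:
        return command != previous_command
    return True
```

For example, `can_be_input("switch on", None, switch_is_on=True)` is false, whereas "to the right" remains valid however many times it was just input.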
Also, there is a conventional system having input capability on the basis of information such as speech, in which the dictionary used for recognition is dynamically switched to the vocabulary corresponding to the functions which can be executed at that time. This system is designed to reduce incorrect recognition while increasing the number of words which can be recognized as a whole, thus improving the ease of operation.
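Dynamic dictionary switching can be sketched as follows, assuming a hypothetical mapping `DICTIONARIES` from system state to vocabulary. The state names and command words are invented for illustration, and `difflib.get_close_matches` again stands in for acoustic matching.

```python
from difflib import get_close_matches

# Hypothetical per-state dictionaries; only one is active at a time.
DICTIONARIES = {
    "main_menu": ["open file", "settings", "terminate"],
    "editor": ["move picture", "erase picture", "close window", "scale down window"],
}

def active_vocabulary(state):
    """Words that can be recognized in the given state."""
    return DICTIONARIES[state]

def total_vocabulary():
    """All words the system can recognize across every state."""
    return sorted({word for words in DICTIONARIES.values() for word in words})

def recognize(input_speech, state):
    """Pick the best match among the currently active words only."""
    return get_close_matches(input_speech, active_vocabulary(state), n=1, cutoff=0.0)[0]
```

Each recognition compares the input against only a few candidates, while the system as a whole recognizes the union of all dictionaries. The same restriction also means that a word outside the active dictionary can never be returned.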
However, in the method of dynamically switching a dictionary used for recognition, if a voice command is incorrectly recognized and executed, the dictionary used for recognition has already been switched. As a result, the voice command which the user intended to utter is excluded from the commands to be recognized, and even if the user inputs the voice command repeatedly, the voice command will be nullified. In such a case, the user has to perform an operation for returning to the state before the incorrectly recognized command was executed. Since the operation of the system differs depending upon the command executed, recovering from the result of an operation performed after incorrect recognition is a big burden for the user.
For example, suppose that, while the user is narrowing down the items on a menu screen, a voice command uttered so as to select an item A is incorrectly recognized and an item B is selected. The submenu of the incorrectly recognized item B has already been displayed on the screen, and the item A is not included therein. In order to correctly select the item A, the user must perform an operation for escaping from the submenu B screen and returning to the previous menu. In another example, if the user tries to scale down a window while a certain application is being executed, but the window is closed instead, the window cannot be scaled down unless the file is opened once more. In yet another example, if the user tries to move a picture while the screen is being edited by drawing software, but the command is incorrectly recognized and the picture is erased, it is necessary to recover the erased picture before moving it.
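The menu example can be traced with the following minimal sketch. The menu layout `MENU`, the "back" command, and the class `MenuSession` are assumptions made for illustration, and the incorrect recognition is simulated by passing the wrong item directly.

```python
# Hypothetical menu tree: each screen maps its recognizable commands
# to the screen displayed after the command is executed.
MENU = {
    "top": {"item A": "submenu A", "item B": "submenu B"},
    "submenu A": {"back": "top"},
    "submenu B": {"back": "top"},
}

class MenuSession:
    def __init__(self):
        self.screen = "top"

    def execute(self, recognized):
        """Apply a recognized command; return False if it is not in the
        dictionary of the current screen (the command is nullified)."""
        choices = MENU[self.screen]
        if recognized not in choices:
            return False
        self.screen = choices[recognized]
        return True

session = MenuSession()
session.execute("item B")                   # user said "item A"; recognizer returned "item B"
retry_accepted = session.execute("item A")  # repeating the command does not help
session.execute("back")                     # extra recovery operation by the user
recovered = session.execute("item A")       # only now can item A be selected
```

In this trace the repeated "item A" is rejected because submenu B is on screen, and the user needs one extra operation before the intended selection succeeds.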
There is also a method and apparatus which, to prevent an improper operation due to incorrect voice recognition, informs the user of the result of the voice recognition of the command input by voice, and performs the operation only after the user instructs confirmation.
However, prompting the user each time always necessitates two operations, inputting and confirmation of the command, to execute one command. This is very inconvenient for the user.
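The confirm-before-execute scheme, and its cost of two user operations per command, can be sketched as follows. The function `run_with_confirmation` and its callback parameters are hypothetical; in a real apparatus the confirmation would itself be a user input rather than a function call.

```python
def run_with_confirmation(recognized, confirm, execute):
    """Execute `recognized` only if the user confirms it.

    Returns the number of user operations spent on this command:
    the utterance itself plus the confirmation response.
    """
    user_operations = 1              # the spoken command
    confirmed = confirm(recognized)  # the user is informed of the result and answers
    user_operations += 1             # the confirmation answer
    if confirmed:
        execute(recognized)
    return user_operations

executed = []
ops_correct = run_with_confirmation("scale down window", lambda r: True, executed.append)
ops_rejected = run_with_confirmation("close window", lambda r: False, executed.append)
```

The rejected command is never executed, so the improper operation is prevented, but a correctly recognized command still costs two operations.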
Regarding words or sentences which were determined to be incorrectly recognized, there is a high possibility that the incorrect recognition will be repeated, since the incorrectly recognized words remain among the words or sentences to be recognized.