1. Field of the Invention
The present invention relates to image processing devices, and more particularly to an image processing device which changes the state of display of a dialogue partner object in response to speech input from a user.
2. Description of the Background Art
Speech recognition devices that recognize words spoken by a user have been utilized in various fields. For example, known applications of such speech recognition devices include image processing devices (e.g., video game machines) which change the content of images (e.g., characters) displayed on the screen in response to speech commands (refer to Japanese Patent Laying-Open No. 9-230890, for example).
However, conventional image processing devices utilizing speech recognition are constructed to change images only when particular words are spoken, so that the operator must previously know the words that can be used as input to the device. If the operator does not know the predefined input words, the operator can only guess what the appropriate words may be, thereby making the image processing device very inconvenient to use. Furthermore, conventional image processing devices utilizing speech recognition do not change the display when an improper or unrecognized word is entered, thereby causing the operator to be puzzled as to whether he/she input a wrong word or the machine is malfunctioning.
Moreover, conventional image processing devices utilizing speech recognition process the results of speech recognition in a fixed way independently of the progress of the program. However, depending on the type of program executed in the image processing device, it may be preferred that the method of processing the speech recognition results is changed as the program progresses. For example, if the program executed in the image processing device is a video game program, an effective way of making the game more amusing is to change the relation between the speech recognition results and actions of the characters as the player clears several stages and becomes more skillful at playing the game. Also, when the program executed in the image processing device is an educational program for teaching language to children, an effective way for successful learning is to change the method of processing the speech recognition results so as to require the children to more correctly pronounce words as their learning progresses.
Accordingly, an object of the present invention is to provide an image processing device which can be easily used even if the operator does not know usable words prior to using the device.
Another object of the invention is to provide an image processing device which can change the way the speech recognition results are processed as the program advances.
To achieve the objects above, the present invention has the following features.
A first aspect of the present invention is directed to an image processing device for varying action of a dialogue partner object displayed on a display device in response to speech input from a user through a microphone. According to the invention, the image processing device comprises:
a converting part for converting an analog speech signal received by the microphone to digital speech data;
a speech recognition part for recognizing a word corresponding to the digital speech data converted by the converting part;
a determining part for determining whether the word recognized by the speech recognition part matches a word to be inputted at that time;
a first display control part for, when the determining part determines a word match, controlling a displayed state of the dialogue partner object to cause the dialogue partner object to perform an action corresponding to the recognized word; and
a second display control part for, when the determining part determines no word match, displaying on the display device an indication to the user that the determining part did not find a match for the word.
As stated above, according to the first aspect of the invention, a determination delivering display is provided that indicates a mismatch of words when a word different from the predetermined words to be inputted is entered, thereby preventing the user from being puzzled or confused when an improper word is entered.
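The match and no-match branches of the first aspect can be illustrated with a minimal sketch. It is not the claimed implementation; the function name `respond_to_word`, the word-to-action table, and the returned strings are hypothetical and chosen only for illustration.

```python
from typing import Dict


def respond_to_word(recognized: str, expected: Dict[str, str]) -> str:
    """Decide what the display should show for one recognized word.

    `expected` is a hypothetical table mapping each word accepted at this
    point to the action the dialogue partner object performs for it.
    """
    if recognized in expected:
        # Role of the first display control part: act on the matched word.
        return "action:" + expected[recognized]
    # Role of the second display control part: report that no match was found.
    return "indication:no_match"
```

For example, with the table `{"sit": "sit_down"}`, the word "sit" selects the action branch, while any other word selects the indication branch.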
According to a second aspect of the invention, in the image processing device of the first aspect,
the second display control part makes a display on the display device, as the determination delivering display, to show that the dialogue partner object cannot understand the input word.
As stated above, according to the second aspect, when a word different from the predetermined words to be inputted is entered, a display is made to show that the dialogue partner object cannot understand the input word, so that the user can more clearly recognize that he/she has entered a wrong word.
According to a third aspect of the invention, in the image processing device of the second aspect,
when the determining part continuously determines a mismatch of words over a given time period, the second display control part further displays on the display device, as the determination delivering display, a message containing a proper word to be inputted at that time.
As stated above, according to the third aspect, when a correct word is not entered over a given time period, a message that contains a correct word to be currently inputted is further displayed, which prevents the user from repeatedly entering wrong words.
According to a fourth aspect of the invention, in the image processing device of the second aspect,
when the determining part repeatedly determines a mismatch of words over a given number of times, the second display control part further displays on the display device, as the determination delivering display, a message containing a proper word to be inputted at that time.
As stated above, according to the fourth aspect, when wrong words are repeatedly entered a given number of times, a message which contains a proper word to be inputted at that time is further displayed, which prevents the user from repeatedly entering wrong words.
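The triggering conditions of the third and fourth aspects (a mismatch persisting over a given time period, or repeated a given number of times) can be sketched as follows. The class name `MismatchTracker` and the default limits of 3 misses and 10 seconds are assumptions made for illustration; the text above does not specify concrete values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MismatchTracker:
    max_misses: int = 3        # assumed "given number of times"
    max_seconds: float = 10.0  # assumed "given time period"
    misses: int = 0
    first_miss_at: Optional[float] = None

    def record(self, matched: bool, now: float) -> bool:
        """Record one determination; return True when a hint message is due."""
        if matched:
            # A match ends the run of mismatches.
            self.misses = 0
            self.first_miss_at = None
            return False
        if self.first_miss_at is None:
            self.first_miss_at = now
        self.misses += 1
        too_many = self.misses >= self.max_misses
        too_long = now - self.first_miss_at >= self.max_seconds
        return too_many or too_long
```

The caller passes the current time in seconds to `record`, which keeps the sketch independent of any particular clock.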
According to a fifth aspect of the invention, in the image processing device of the third aspect,
the second display control part controls the display on the display device so that the word to be inputted at that time and the remaining part of the message are displayed in different colors in the message.
According to a sixth aspect of the invention, in the image processing device of the fourth aspect,
the second display control part controls the display on the display device so that the word to be inputted at that time and the remaining part of the message are displayed in different colors in the message.
As stated above, according to the fifth and sixth aspects, a word to be currently inputted is displayed in a color different from the remaining part of the message sentence, so that the user can easily recognize the word to be inputted.
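One way to prepare such a two-color message is to split it into runs of text, each tagged with a color. This is a sketch only; the function name `color_segments` and the colors "red" and "white" are hypothetical, and only the first occurrence of the word is highlighted.

```python
from typing import List, Tuple


def color_segments(message: str, word: str,
                   word_color: str = "red",
                   text_color: str = "white") -> List[Tuple[str, str]]:
    """Split `message` into (text, color) runs, highlighting `word` once."""
    before, found, after = message.partition(word)
    if not found:
        # The word does not occur: the whole message uses the normal color.
        return [(message, text_color)]
    runs = [(before, text_color), (found, word_color), (after, text_color)]
    # Drop empty runs, e.g. when the word starts or ends the message.
    return [(text, color) for text, color in runs if text]
```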
According to a seventh aspect of the invention, an image processing device is provided for displaying a given image on a display device according to set program data and for varying the action of a dialogue partner object displayed on the display device in response to a spoken word from a user through a microphone, wherein the device comprises:
a converting part for converting an analog speech signal input to the microphone to digital speech data;
a speech recognition part for recognizing a word corresponding to the digital speech data converted by the converting part;
a display control part for controlling a displayed state of the dialogue partner object on the basis of the result of recognition made by the speech recognition part; and
a degree of progress detecting part for detecting a degree of progress of the program data;
wherein the display control part changes, in steps, the way the displayed state of the dialogue partner object is controlled in accordance with the degree of progress of the program data detected by the degree of progress detecting part.
As stated above, according to the seventh aspect, the displayed state of the dialogue partner object is controlled such that it is changed in steps in accordance with the degree of progress of the program data, which enables the dialogue to be controlled in a varied manner based on the progress of the program.
According to an eighth aspect of the invention, in the image processing device of the seventh aspect,
the display control part comprises,
a first display control part for causing the dialogue partner object to perform a predetermined action independently of the word recognized by the speech recognition part when the degree of progress of the program data detected by the degree of progress detecting part is at a relatively elementary level, and
a second display control part for causing the dialogue partner object to perform a corresponding action in accordance with the word recognized by the speech recognition part when the degree of progress of the program data detected by the degree of progress detecting part is at a relatively advanced level.
As stated above, according to the eighth aspect of the invention, when the degree of progress of the program data is at a relatively elementary level, the dialogue partner object is made to perform a given action independently of the type of the recognized word. On the other hand, when the degree of progress of the program data is at a relatively advanced level, the dialogue partner object is made to perform a corresponding action in accordance with the type of the recognized word. Thus, the recognized result can influence the display control of the dialogue partner object to varying degrees in accordance with the progress of the program data.
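The two-level behavior of the eighth aspect can be sketched as follows. The function name `choose_action`, the integer progress scale, the boundary `advanced_from`, and the default action name are all assumptions made for illustration.

```python
from typing import Dict, Optional


def choose_action(recognized: str, progress: int, actions: Dict[str, str],
                  advanced_from: int = 3,
                  default_action: str = "look_at_player") -> Optional[str]:
    """Pick the dialogue partner object's action for the current progress."""
    if progress < advanced_from:
        # Elementary level: the same action regardless of the recognized word.
        return default_action
    # Advanced level: the action depends on the recognized word
    # (None when the word has no corresponding action).
    return actions.get(recognized)
```

With this sketch, any speech input produces a reaction early in the program, while later stages require a word that has a defined action.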
According to a ninth aspect of the invention, in the image processing device of the eighth aspect,
the second display control part comprises,
a determining part for determining whether the word recognized by the speech recognition part matches a word to be inputted at that time, and
a corresponding action control part for, when the determining part determines a word match, causing the dialogue partner object to perform an action corresponding to the matched word.
As stated above, according to the ninth aspect of the invention, when a recognized word matches a word to be currently inputted, the dialogue partner object is made to perform an action corresponding to the matched word, so that the actions to be performed by the dialogue partner object can be arbitrarily defined by the program.
According to a tenth aspect of the invention, in the image processing device of the ninth aspect,
the speech recognition part comprises:
a dictionary part in which word data is stored as a reference,
a correlation distance calculating part for comparing the digital speech data with the word data in the dictionary part to calculate a correlation distance indicating the degree of similarity for each piece of word data in the dictionary part,
a ranking part for ranking the word data stored in the dictionary part in order of similarity, starting from the highest, on the basis of the correlation distances calculated by the correlation distance calculating part, and
a candidate word data output part for outputting, as candidate word data, the word data having the highest rank among the words stored in the dictionary part to the determining part,
wherein the determining part determines whether the candidate word data provided from the candidate word data output part matches a word to be inputted at that time, wherein the determining part starts with the candidate word data having the highest similarity, and stops the determination operation when a match is determined and gives a match determination output to the corresponding action control part.
As stated above, according to the tenth aspect of the invention, starting with the candidate word data having the highest similarity, the candidate word data supplied is checked to see whether it matches a word to be inputted at that time. The dialogue partner object is made to perform the corresponding action when a match is found. Accordingly, it is possible to cause the dialogue partner object to perform a desired action even when the speech recognition is not very accurate.
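The ranking and match determination of the tenth aspect can be sketched as follows. The sketch assumes that a smaller correlation distance means a higher similarity and that several top-ranked words are passed on as candidates; the function names and the `top_n` parameter are hypothetical, and the computation of the correlation distance itself is outside the scope of the sketch.

```python
from typing import Dict, List, Optional, Set


def rank_candidates(distances: Dict[str, float], top_n: int) -> List[str]:
    """Rank dictionary words by similarity and keep the `top_n` best.

    Assumption: a smaller correlation distance means a higher similarity.
    """
    ranked = sorted(distances, key=distances.get)
    return ranked[:top_n]


def first_match(candidates: List[str], expected: Set[str]) -> Optional[str]:
    """Check candidates from most to least similar; stop at the first match."""
    for word in candidates:
        if word in expected:
            return word
    return None
```

For example, if "seat" ranks above "sit" but only "sit" is a word to be inputted at that time, the determination still succeeds on the second candidate.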
According to an eleventh aspect of the invention, in the image processing device of the tenth aspect,
the determining part reduces the number of word data to be selected from the candidate word data and subjected to the match determination as the degree of progress of the program data detected by the degree of progress detecting part advances.
As stated above, according to the eleventh aspect of the invention, the number of pieces of word data to be selected from the candidate word data as subjects for match determination is reduced as the degree of progress of the program data advances. Thus, it is possible to provide stricter speech recognition so as to require more accurate speech input from the user as the program data progresses.
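The narrowing of the candidate set described for the eleventh aspect can be sketched as follows. The linear rule (one fewer candidate per unit of progress), the starting count of 5, and the function name `matches_at_progress` are assumptions made for illustration only.

```python
from typing import List, Set


def matches_at_progress(ranked: List[str], expected: Set[str], progress: int,
                        start: int = 5, minimum: int = 1) -> bool:
    """Check only as many top-ranked candidates as the progress allows.

    `ranked` is assumed to be ordered from most to least similar.
    """
    # Fewer candidates are examined as the degree of progress advances.
    limit = max(minimum, start - progress)
    return any(word in expected for word in ranked[:limit])
```

Under this rule, a pronunciation that places the intended word third in the ranking is accepted early in the program but rejected once only the top two candidates are examined.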
According to a twelfth aspect of the invention, in the image processing device of the ninth aspect,
the speech recognition part comprises:
a dictionary part in which word data to be inputted at that time is stored,
a correlation distance calculating part for comparing the digital speech data and each piece of the word data stored in the dictionary part to calculate a correlation distance showing the degree of similarity for each piece of word data, and
a candidate word data output part for selecting word data having the highest similarity on the basis of the correlation distances calculated by the correlation distance calculating part and outputting the selected word data and its correlation distance as candidate word data to the determining part,
and wherein the determining part
detects whether a first similarity defined by the correlation distance contained in the candidate word data is higher than a second similarity defined by a preset threshold, and
when the first similarity is higher than the second similarity, determines that the word recognized by the speech recognition part matches a word to be inputted at that time, and
when the second similarity is higher than the first similarity, determines that the word recognized by the speech recognition part does not match a word to be inputted at that time.
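The threshold comparison of the twelfth aspect can be sketched as follows, again assuming that a smaller correlation distance means a higher similarity, so that "the first similarity is higher than the second similarity" corresponds to the candidate's distance being below the threshold distance. The function names are hypothetical.

```python
from typing import Dict, Tuple


def best_candidate(distances: Dict[str, float]) -> Tuple[str, float]:
    """Select the word with the highest similarity and its distance."""
    word = min(distances, key=distances.get)
    return word, distances[word]


def is_match(candidate_distance: float, threshold_distance: float) -> bool:
    """Match only when the candidate is more similar than the threshold.

    Assumption: a smaller distance means a higher similarity.
    """
    return candidate_distance < threshold_distance
```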
According to a thirteenth aspect of the invention, in the image processing device of the seventh aspect,
the program data is program data for a video game stored in a portable storage medium.
According to a fourteenth aspect of the invention, in a storage medium which contains program data executed in an image processing device for changing action of a dialogue partner object displayed on a display device in response to speech commands inputted from a user through a microphone,
when executing the program data, the image processing device
converts an analog speech signal inputted by the microphone to digital speech data,
recognizes a word corresponding to the converted digital speech data, and
determines whether the recognized word matches a particular word to be inputted at that time,
and when a word match is determined, the image processing device controls the displayed state of the dialogue partner object to cause the dialogue partner object to perform an action corresponding to the recognized word, and
when no word match is determined, the image processing device makes a determination delivering display on the display device to show the result of this determination to the user.
According to a fifteenth aspect of the invention, in a storage medium which contains program data executed in an image processing device for changing action of a dialogue partner object displayed on a display device in response to speech commands inputted from a user through a microphone,
when executing the program data, the image processing device
converts an analog speech signal inputted by the microphone to digital speech data,
recognizes a word corresponding to the converted digital speech data, and
controls a displayed state of the dialogue partner object on the basis of the recognized word,
and wherein the displayed state of the dialogue partner object is controlled such that it is changed in steps in accordance with the degree of progress of the program data.
These and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.