1. Technical Field
The present disclosure relates to an operation assisting device and an operation assisting method that assist an operation based on the utterance of a keyword.
2. Description of the Related Art
In recent years, techniques that detect the utterance of a predetermined keyword and, in response, start a predetermined operation of an apparatus, such as activating a system, have been widely put to practical use.
Usually, the determination of whether or not the keyword has been spoken (hereinafter, called “keyword determination”) is performed by calculating an evaluation value (score) indicating the plausibility (hereinafter, called “likelihood”) that the keyword is included in a spoken voice, and then determining whether or not the evaluation value is greater than or equal to a predetermined threshold value. The evaluation value is calculated by, for example, voice recognition processing of the spoken voice.
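The threshold comparison described above can be sketched as follows. This is a minimal illustration only: the function name `is_keyword_detected` and the numeric values are assumptions introduced here, and the calculation of the evaluation value itself (for example, by voice recognition processing) is outside the sketch.

```python
def is_keyword_detected(evaluation_value: float, threshold: float) -> bool:
    """Keyword determination: true when the evaluation value is
    greater than or equal to the predetermined threshold value."""
    return evaluation_value >= threshold


# Hypothetical values chosen only for illustration.
THRESHOLD = 0.6
for score in (0.8, 0.6, 0.4):
    print(score, is_keyword_detected(score, THRESHOLD))
```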
However, in some cases, even if a keyword is actually spoken, an ambient sound, unclear utterance, or the like causes the evaluation value to be low. In this case, it becomes difficult to operate the apparatus despite the utterance of the keyword (hereinafter, called “detection failure”). Conversely, in some cases, the evaluation value of an ambient sound or of a spoken voice other than the keyword is high even though no keyword has actually been spoken. In this case, the apparatus is operated unintentionally despite no utterance of the keyword (hereinafter, called “false detection”).
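The two error states can be distinguished by comparing the result of the keyword determination with whether a keyword was actually spoken. The following is a sketch under stated assumptions: the function name `classify_outcome` and the labels "correct detection" and "correct rejection" are introduced here for illustration, and the numeric values are hypothetical.

```python
def classify_outcome(keyword_spoken: bool,
                     evaluation_value: float,
                     threshold: float) -> str:
    """Label one keyword determination against the ground truth."""
    detected = evaluation_value >= threshold
    if keyword_spoken and not detected:
        # Keyword spoken, but the evaluation value fell below the threshold.
        return "detection failure"
    if not keyword_spoken and detected:
        # No keyword spoken, but the evaluation value reached the threshold.
        return "false detection"
    # Labels below are illustrative and do not appear in the source text.
    return "correct detection" if detected else "correct rejection"


# A keyword masked by ambient sound scores low (hypothetical values).
print(classify_outcome(True, 0.4, 0.6))
# An ambient sound that happens to score high (hypothetical values).
print(classify_outcome(False, 0.7, 0.6))
```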
To address this, Japanese Unexamined Patent Application Publication No. 2010-281855, for example, describes a technique in which a device that assists an operation based on utterance presents to the user, by voice, the timing at which a keyword is to be spoken, and performs keyword determination only at that timing. According to such a technique, for an operation performed at the timing decided on the device side, it is possible to reduce the occurrences of detection failure and false detection.
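Restricting keyword determination to a timing decided on the device side can be pictured as a time window outside of which no determination is made. The sketch below is an illustration only and is not taken from the cited publication: the function name `determine_in_window`, the window representation in seconds, and all numeric values are assumptions.

```python
def determine_in_window(evaluation_value: float,
                        threshold: float,
                        now: float,
                        window_start: float,
                        window_end: float) -> bool:
    """Perform keyword determination only inside the device-chosen window."""
    if not (window_start <= now <= window_end):
        # Outside the presented timing, the utterance is never evaluated.
        return False
    return evaluation_value >= threshold


# The device prompts the user and listens from t = 10 s to t = 13 s
# (hypothetical values).
print(determine_in_window(0.9, 0.6, now=11.0, window_start=10.0, window_end=13.0))
print(determine_in_window(0.9, 0.6, now=20.0, window_start=10.0, window_end=13.0))
```

The second call shows the limitation of this approach: a clearly spoken keyword is ignored when it falls outside the window.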
In addition, Japanese Unexamined Patent Application Publication No. 2012-242609 describes a technique in which the direction of the eyes of a user is detected and the threshold value for keyword determination is decreased during a time interval in which the eyes of the user are directed at a robot that is the operation target. Decreasing the threshold value reduces detection failure, and increasing the threshold value reduces false detection. Therefore, according to such a technique, for an operation that the robot performs when the user speaks to it, it is possible to reduce the occurrences of both detection failure and false detection.
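The effect of switching the threshold value according to the direction of the eyes can be sketched as follows. This is an illustrative sketch, not the implementation of the cited publication: the names `select_threshold` and `is_keyword_detected`, the boolean gaze input, and both threshold values are assumptions.

```python
# Hypothetical threshold values chosen only for illustration.
DEFAULT_THRESHOLD = 0.7   # higher value: fewer false detections
LOWERED_THRESHOLD = 0.5   # lower value: fewer detection failures


def select_threshold(eyes_on_target: bool) -> float:
    """Use the lowered threshold while the user looks at the operation target."""
    return LOWERED_THRESHOLD if eyes_on_target else DEFAULT_THRESHOLD


def is_keyword_detected(evaluation_value: float, eyes_on_target: bool) -> bool:
    """Keyword determination with a gaze-dependent threshold."""
    return evaluation_value >= select_threshold(eyes_on_target)


# The same evaluation value is accepted only while the user looks at the target.
print(is_keyword_detected(0.6, eyes_on_target=True))
print(is_keyword_detected(0.6, eyes_on_target=False))
```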
However, in the technique described in Japanese Unexamined Patent Application Publication No. 2010-281855, an operation based on keyword utterance (hereinafter, called “keyword utterance operation”) can be performed only at the timing decided on the device side. In addition, the technique described in Japanese Unexamined Patent Application Publication No. 2012-242609 is applicable only to uses in which the user is able to look at the operation target for a relatively long time.
For example, an operation in which the driver of a vehicle turns on a car air-conditioner while driving is usually performed at a timing of the driver's own choosing. In addition, it is difficult to perform such an operation while continuously looking at the device. Accordingly, it is difficult to apply the techniques described in Japanese Unexamined Patent Application Publication No. 2010-281855 and Japanese Unexamined Patent Application Publication No. 2012-242609 to such a use. In other words, with these techniques of the related art, a highly accurate keyword utterance operation can be realized only in severely limited uses.