1. Field of the Invention
The present invention relates to a speech recognizer control system, a speech recognizer control method, and a speech recognizer control program for recognizing a speech utterance input from a user and then acquiring information for controlling a device on the basis of the result of the recognition.
2. Description of the Related Art
In recent years, systems that allow a user to, for example, operate a device have used a speech recognizer control system that recognizes speech utterances input by the user and acquires the information required to operate the device. Such a speech recognizer control system recognizes voices or speech utterances input by the user, then responds or outputs voice guidance on the basis of the result of the recognition, and prompts the user for a further speech utterance, thereby interacting with the user. By recognizing the dialogue with the user, the system obtains the information required to operate the device, and the device is controlled.
Such a speech recognizer control system is mounted in, for example, a vehicle, to control devices such as an audio device, a navigation device, and an air conditioner. If there is a plurality of types of devices to be controlled, then the user is required to identify hierarchical items, such as the type of the device to be controlled among the multiple devices, the type of function of that device, and the intended operation of the device. This inevitably complicates the input of speech utterances by the user. Hence, there has been proposed a speech recognizer control system adapted to interactively obtain information that is necessary for the control or the like of a device while prompting a user to input missing information, thus obviating the need for the user to supply, by speech utterance, information on the type, the function, the operation, or the like of the device in a hierarchical order (refer to, for example, Japanese Patent Application Publication No. 2001-249685 (hereinafter referred to as “Patent Document 1”)).
A voice interactive device, which is the speech recognizer control system in Patent Document 1, is equipped with tree-structured data for recognizing speech utterances, which is composed of groups of hierarchical items related to the types, the functions, and the operations of the devices involved. The grouped items for the recognition of speech utterances are arranged and connected in a hierarchical order. On the basis of input signals received from a speech recognizer, the voice interactive device determines which items of the speech utterance recognition tree-structured data are still missing for completing the tree structure, presumes an item intended by the user among the missing items, and presents the presumed item to the user so as to prompt the user to input the required item. Then, when the tree has been formed, a signal associated with the tree is output to an external source. Based on the output signal, a response for confirmation with the user is given, and the device is controlled. At this time, if the voice interactive device cannot presume the item that is considered to be intended by the user on the basis of the input signal received from the speech recognizer, then the voice interactive device presumes the item on the basis of a last mode, in which the last operation end state of the device has been stored. For example, if the last mode for “audio” is composed of “audio,” “MD,” and “first number,” and “audio” is input, then it is presumed that the user intends to play the first number of the MD on the audio device.
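The last-mode presumption described above can be illustrated with a minimal sketch. It is not taken from Patent Document 1: the names TREE, LAST_MODE, and presume_path, the three-level layout (device, function, operation), and all tree contents other than the “audio,” “MD,” “first number” example are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical tree-structured data: device -> function -> operations.
# Only the "audio" / "MD" / "first number" path comes from the example in the
# text; every other entry is an illustrative assumption.
TREE: Dict[str, Dict[str, List[str]]] = {
    "audio": {"MD": ["first number", "next number"], "radio": ["FM", "AM"]},
    "air conditioner": {"temperature": ["up", "down"]},
}

# Hypothetical last mode: the last operation end state stored per device.
LAST_MODE: Dict[str, Tuple[str, str, str]] = {
    "audio": ("audio", "MD", "first number"),
}


def presume_path(recognized: List[str]) -> Optional[Tuple[str, str, str]]:
    """Complete a partial (device, function, operation) path from the last mode.

    Returns the completed path, or None when the missing items cannot be
    presumed and the user would have to be prompted for them.
    """
    path = list(recognized)
    if not path or len(path) > 3 or path[0] not in TREE:
        return None
    if len(path) < 3:
        last = LAST_MODE.get(path[0])
        # Reuse the last mode only if it agrees with every item already recognized.
        if last is None or list(last[: len(path)]) != path:
            return None
        path = list(last)
    device, function, operation = path
    # Reject paths that do not exist in the tree.
    if operation not in TREE[device].get(function, []):
        return None
    return (device, function, operation)
```

Under these assumptions, presume_path(["audio"]) yields the stored path ("audio", "MD", "first number"), whereas presume_path(["air conditioner"]) yields None because no last mode has been stored for that device, so the missing items would have to be obtained by prompting the user.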
Meanwhile, some devices perform operations automatically. For example, in a vehicle, when shuffle playback of an audio device or automatic control of an air conditioner is carried out, the operations of playing back a number and changing an air volume or a set temperature are performed automatically by the devices, meaning that they are operations not intended by the driver. Further, a device may be operated by speech utterances from a plurality of users. For instance, there is a case where a device is operated by speech utterances from a passenger rather than the driver. In this case also, the operations are not the ones intended by the driver. When the operational state of a device is changed irrespective of the user's intention, the user may wish to stop or change the operation. At this time, the speech utterance from the user is made reflexively in response to the unintended operation, or the user may not be familiar with the speech utterances for operating the device, probably because he/she uses the device infrequently. It is expected, therefore, that speech utterances from the user will be unclear, leading to a high possibility that the speech utterances include insufficient information.
However, the voice interactive device of Patent Document 1 presumes an item considered to be intended by a user by assuming that the user is very likely to select the same operation as in the last mode. In other words, the voice interactive device assumes that a device is always operated by speech utterances of the same user. Therefore, if an operation is performed automatically by a device, if an operation is performed by a speech utterance of another user, or if a speech utterance of the user is made in response to an operation not intended by the user, then the voice interactive device fails to properly presume the type or the like of the device to be controlled. This has inconveniently led to inefficient responses to the user or inefficient control of the device.