A voice operation, in which an operation is performed using voice alone, is known as a method of easily operating a vehicle or an on-vehicle instrument while driving. Voice operations can be used to perform operations such as periphery searches and music searches. For example, when an utterance such as “nearby restaurants” is made, a search for nearby restaurants is performed and the search results are output as candidates. The user can then perform an operation to select a desired restaurant from among the candidates.
However, to select a desired candidate from the search results obtained through a voice operation, the user must check the candidates one by one and perform a determination operation for each. Numerous voice operations are therefore required, which is troublesome and prevents intuitive operation. Furthermore, in a voice operation the voice recognition results are displayed on a monitor of a navigation device, for example, and the operator must look at the monitor to check them. As a result, the operator's line of sight may shift away from the road, causing inconvenience during driving.
To reduce the troublesomeness of voice operations, a multi-modal constitution in which operations can also be performed using a remote controller (hereinafter abbreviated to ‘remocon’) or a touch panel may be adopted. With a remote controller, however, there is a limit to the number of operations that can be performed using a fixed number of keys; moreover, when a large number of keys are provided, operations become complicated and the operation allocated to each key cannot be learned easily, making intuitive operation impossible. With a touch panel, operations are performed by touching the monitor, and therefore the operator must necessarily look at the monitor, which involves a shift in the operator's line of sight. Hence, the convenience of touch panel operations during driving is poor.
To solve these problems, Patent Document 1 discloses an information input device with which an on-vehicle instrument can be operated while maintaining a driving posture, by using spatial operations instead of voice operations. With this information input device, a driver raises a hand into the range of a virtual space and then opens the closed hand. This hand movement is picked up by a camera, and when the resulting image and position correspond to a predetermined image and position, a standby state in which input is possible is established. The driver then uses the hand to grab a desired menu space from among a plurality of menu spaces provided in the virtual space. This movement is likewise picked up by the camera, whereupon the movement and position of the hand are recognized and the menu space grabbed by the hand is determined. The determination result is then supplied to a navigation device and is also called back to the driver by voice.
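The flow described above can be sketched as a simple state machine. This is purely an illustrative reconstruction; the class, method, and menu names are assumptions and do not appear in Patent Document 1.

```python
# Hypothetical sketch of the Patent Document 1 flow: camera frames drive a
# state machine that enters a standby state when an open hand is detected
# inside the virtual space, then determines which menu space is grabbed.
# All identifiers here are illustrative assumptions, not from the patent.

MENU_SPACES = {1: "destination", 2: "audio", 3: "air conditioner"}

class SpatialInput:
    def __init__(self):
        self.state = "idle"

    def on_frame(self, hand_shape, position):
        """Process one camera frame: hand_shape is 'open' or 'grab';
        position is the index of the virtual menu space, or None."""
        if self.state == "idle":
            # Opening the closed hand inside the virtual space arms input.
            if hand_shape == "open" and position is not None:
                self.state = "standby"
            return None
        if self.state == "standby" and hand_shape == "grab":
            menu = MENU_SPACES.get(position)
            if menu is not None:
                self.state = "idle"
                # The determination result would be supplied to the
                # navigation device and called back to the driver by voice.
                return menu
        return None

dev = SpatialInput()
dev.on_frame("open", 2)         # open hand in the virtual space -> standby
print(dev.on_frame("grab", 2))  # -> audio
```

The point the sketch makes is that every selection is gesture-only: each menu level requires another grab, which is why a deep hierarchy demands many spatial operations.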
Further, Patent Document 2 discloses an on-vehicle instrument interface with which an instrument can be operated using a combination of voice recognition and spatial operations. According to this on-vehicle instrument interface, when characteristic data are matched between an input voice based on an utterance of an operator and a demonstrative pronoun registered in a voice dictionary, and it is further confirmed within a preset allowable time period that a gesture (a hand shape) of the operator matches a registered pattern in a hand dictionary, the interface specifies the on-vehicle instrument associated with the matching registered pattern as the subject instrument and obtains the operational state of that instrument. A control command for switching the acquired operational state to another operational state is then created and transmitted to the subject instrument. In other words, the subject instrument whose operational state is to be switched is specified by a combination of a voice (the demonstrative pronoun) and a gesture (pointing).

Patent Document 1: Japanese Unexamined Patent Publication No. 2000-75991
Patent Document 2: Japanese Unexamined Patent Publication No. 2005-178473
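The voice-plus-gesture scheme of Patent Document 2 can be sketched as follows, assuming a simple dictionary lookup and a time-window check; the dictionaries, time limit, and function names are hypothetical illustrations, not from the patent.

```python
# Hypothetical sketch of the Patent Document 2 scheme: a demonstrative
# pronoun recognized by voice, confirmed by a pointing gesture within an
# allowable time period, selects a subject instrument whose operational
# state is then switched. All values and names are assumptions.

VOICE_DICTIONARY = {"this", "that"}           # registered demonstrative pronouns
HAND_DICTIONARY = {"point_left": "audio",     # registered gesture patterns ->
                   "point_up": "sunroof"}     # associated on-vehicle instruments
ALLOWABLE_TIME = 2.0                          # preset allowable period, seconds

def make_command(utterance, utter_time, gesture, gesture_time, states):
    """Return (subject_instrument, control_command) or None on no match."""
    if utterance not in VOICE_DICTIONARY:
        return None                           # voice characteristic data mismatch
    if gesture_time - utter_time > ALLOWABLE_TIME:
        return None                           # gesture confirmed too late
    instrument = HAND_DICTIONARY.get(gesture)
    if instrument is None:
        return None                           # no registered pattern matched
    # Create a command switching the acquired operational state to another one.
    current = states[instrument]
    return instrument, ("off" if current == "on" else "on")

states = {"audio": "on", "sunroof": "off"}
print(make_command("this", 0.0, "point_left", 1.5, states))  # -> ('audio', 'off')
```

As the sketch suggests, the voice input serves only as a trigger and carries no search vocabulary, which is the limitation discussed below.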
However, in the technique disclosed in Patent Document 1, the menu for the spatial operation is displayed on a monitor of the navigation device, and the operator must perform operations while relying on voice callbacks, on the assumption that the menu is virtually provided in a space that does not actually exist. Hence, intuitive operation is impossible. Moreover, to check the menu display, the operator must look at the monitor of the navigation device, and therefore the operator's line of sight may shift, which is problematic during driving. Furthermore, with this technique all operations are performed as spatial operations and voice operations are not used. Therefore, in the case of a deeply hierarchical operation such as a search, a large number of selection procedures must be performed through spatial operations.
Further, with the technique disclosed in Patent Document 2, voice operations such as uttering the demonstrative pronoun “this” are used merely to trigger spatial operations, and therefore the various words required for searches and so on are not recognized. Moreover, with this technique the subject instrument is specified and operated by pointing at the instrument in a spatial operation, and a display unit such as a monitor is not provided. Therefore, only simple operations, such as switching the pointed instrument on and off, can be performed.