Speech uttered by a user is conventionally used for controlling a strobe light or a shutter of a camera. For instance, JP-A-S64-56428 describes a camera control system using a voice input as follows: a speech corresponding to a required manipulation is inputted; the speech is recognized by a voice recognition unit; and the camera is controlled based on control processing corresponding to the recognition result.
In this voice-controlled camera, each function is executed by a single voice input having one-to-one correspondence with that function. For instance, only “no strobe” is functional as the voice input for prohibiting the strobe light at shooting, even though a user might instead say “strobe off,” “stop strobe,” or “flash off.”
By contrast, in JP-A-2000-214525, different speeches can each be functional as a voice input for executing a predetermined function of a voice-controlled camera. In this voice-controlled camera, a plurality of speeches are stored as the voice inputs corresponding to the predetermined function. Inputting any one of the plurality of speeches thereby enables the predetermined function to be executed.
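The many-to-one correspondence described above can be illustrated with a minimal sketch. It is not taken from either cited publication; the table name `COMMAND_PHRASES`, the function name `strobe_off`, and the helper `resolve_function` are hypothetical, and the recognizer is assumed to deliver its result as plain text.

```python
# Hypothetical command table: each function name maps to the set of
# phrases accepted as its voice input (names and phrases are illustrative).
COMMAND_PHRASES = {
    "strobe_off": {"no strobe", "strobe off", "stop strobe", "flash off"},
}

# Invert the table once so a recognized phrase can be looked up directly.
PHRASE_TO_FUNCTION = {
    phrase: function
    for function, phrases in COMMAND_PHRASES.items()
    for phrase in phrases
}


def resolve_function(recognized_text):
    """Return the function name for a recognized phrase, or None if unknown."""
    return PHRASE_TO_FUNCTION.get(recognized_text.strip().lower())
```

Under these assumptions, `resolve_function("flash off")` and `resolve_function("no strobe")` both return `"strobe_off"`, whereas a one-to-one system would accept only a single phrase per function.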
This kind of voice recognition system that accepts different input speeches is adopted not only for the voice-controlled camera but also for a car navigation device. For instance, a user can use either “zoom in” or “enlarge” as a voice input for switching a scale of a road map, so that the car navigation device executes enlargement of the road map. Furthermore, the car navigation device notifies the user of the content of the executed function through a guidance voice. For instance, when the user utters “zoom in,” the car navigation device notifies the user with “MAP IS TO BE ZOOMED IN” as the guidance voice.
However, in the above car navigation device, even when the user utters “enlarge” instead of “zoom in,” the device similarly notifies the user with “MAP IS TO BE ZOOMED IN.” In this case, the guidance voice includes “ZOOMED IN,” which differs from the uttered “enlarge,” so that the user may mistakenly conclude that the inputted speech of “enlarge” was misrecognized.
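The mismatch can be reproduced in a minimal sketch, assuming the guidance voice is selected by the executed function alone. The tables `NAV_PHRASE_TO_FUNCTION` and `GUIDANCE_BY_FUNCTION`, the function name `map_zoom_in`, and the helper `guidance_for` are hypothetical and serve only to illustrate the behavior described above.

```python
# Hypothetical tables for a car navigation device (illustrative names only).
# Two different utterances map to the same function.
NAV_PHRASE_TO_FUNCTION = {
    "zoom in": "map_zoom_in",
    "enlarge": "map_zoom_in",
}

# Assumption: one fixed guidance message is stored per function,
# regardless of which phrase the user actually uttered.
GUIDANCE_BY_FUNCTION = {
    "map_zoom_in": "MAP IS TO BE ZOOMED IN",
}


def guidance_for(utterance):
    """Return the guidance message for an utterance, or None if unknown."""
    function = NAV_PHRASE_TO_FUNCTION.get(utterance)
    if function is None:
        return None
    return GUIDANCE_BY_FUNCTION[function]
```

In this sketch, `guidance_for("enlarge")` yields the same message as `guidance_for("zoom in")`, so the wording the user hears does not echo the word the user spoke.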