Conventionally, a voice recognition apparatus that recognizes a word or phrase intentionally uttered by a user when the user's voice is input into the apparatus is well known. Various systems having a user interface that uses the voice recognition apparatus can be operated by the user in a hands-free manner. Thus, the systems are easy for a person to use.
The voice recognition apparatus compares a time-series feature obtained from a voice signal input into the apparatus with reference voice patterns, each of which corresponds to a respective word or phrase and is registered in advance. Then, the apparatus calculates a degree of similarity representing how closely each reference voice pattern matches the feature. The apparatus recognizes the voice uttered by the user as the word or phrase corresponding to the reference voice pattern having the highest degree of similarity.
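The matching step described above can be illustrated with a minimal sketch. The source does not specify how the degree of similarity is computed, so this sketch assumes dynamic time warping (DTW) over Euclidean frame distances, mapped to a similarity of 1 / (1 + distance). The function names and the sample patterns are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity(feature, reference):
    """Map a DTW distance to a similarity in (0, 1]; assumed mapping."""
    return 1.0 / (1.0 + dtw_distance(feature, reference))


def recognize(feature, references):
    """Return the registered word with the highest similarity and its score."""
    scores = {word: similarity(feature, ref) for word, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical registered reference patterns (one-dimensional frames).
REFERENCES = {
    "home": [(0.0,), (1.0,), (2.0,)],
    "office": [(5.0,), (5.0,), (5.0,)],
}

word, score = recognize([(0.0,), (1.0,), (1.0,), (2.0,)], REFERENCES)
print(word, score)
```

DTW is used here only because it tolerates differences in speaking rate between the input and the registered pattern; any other similarity measure could be substituted without changing the selection of the highest-scoring pattern.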
Here, a voice recognition apparatus is well known that starts to execute a process for recognizing voice when the user operates an utterance start button for informing the apparatus of the start of utterance. It is necessary for the user to operate the button every time the user utilizes the apparatus. Thus, it is bothersome for the user to utilize the apparatus. For example, when a navigation apparatus having the voice recognition apparatus is mounted on a vehicle and a driver as the user wants to set a destination that has come up in conversation, the driver has to utter the word or phrase indicating the destination again, or execute an operation corresponding to uttering the word or phrase. Accordingly, the user may find the apparatus bothersome.
To resolve the above difficulty, an apparatus is disclosed in JP-A-2008-14818 in which the voice uttered by the user is always input into the apparatus and the apparatus continuously executes the voice recognition. The apparatus displays the recognition results, and when the user executes a determination operation, the apparatus executes a process based on the recognition results.
Since the above apparatus always executes the voice recognition, the apparatus also displays recognition results corresponding to words or phrases that do not need to be recognized by the apparatus. Thus, the user may be bothered and feel unpleasant. Accordingly, it is preferable for the apparatus to have a reject function. The reject function determines whether the recognition results are correct before displaying them and, when the recognition results are not correct, deletes them without displaying them.
The above determination of whether the recognition results are correct may be performed by determining whether the degree of similarity is equal to or larger than a predetermined threshold serving as a correct-incorrect determination value. Specifically, when the degree of similarity is equal to or larger than the threshold, the apparatus determines that the recognition results are correct. When the degree of similarity is smaller than the threshold, the apparatus determines that the recognition results are not correct.
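The reject function with a fixed threshold can be sketched as follows. This is only an illustration of the comparison described above; the function names, the candidate list and the threshold value 0.7 are hypothetical.

```python
def is_correct(similarity, threshold):
    """A result is treated as correct when similarity >= threshold."""
    return similarity >= threshold


def results_to_display(candidates, threshold):
    """Keep only results judged correct; the rest are deleted, not displayed."""
    return [(word, sim) for word, sim in candidates if is_correct(sim, threshold)]


# Hypothetical recognition results as (word, degree of similarity) pairs.
candidates = [("home", 0.91), ("office", 0.70), ("station", 0.42)]
print(results_to_display(candidates, threshold=0.7))
```

A result whose degree of similarity equals the threshold is kept, matching the "equal to or larger than" condition in the text.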
Here, the threshold need not be constant. The threshold may instead be changed according to environmental conditions so that usability of the apparatus is improved. For example, in an apparatus disclosed in JP-A-2001-34291, when the voice recognition is executed in a noisy environment and the apparatus determines whether the recognition results are correct, the apparatus adjusts the threshold according to the magnitude of the ambient noise so that the ratio at which the recognition results are determined to be correct is kept constant regardless of the magnitude of the noise. Specifically, as shown in FIG. 9, even when the same voice is input into the apparatus, the larger the noise, the smaller the degree of similarity, as shown by the multiple points SP. Thus, the threshold is changed so that it becomes smaller as the noise becomes larger, as shown by the line ST.
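The noise-dependent threshold can be sketched as below. The source states only that the threshold becomes smaller as the noise becomes larger (line ST in FIG. 9); the linear form, the lower bound and all numeric values here are assumptions chosen for illustration, as are the function names.

```python
def noise_adjusted_threshold(noise_level, base=0.8, slope=0.05, floor=0.2):
    """Threshold that decreases with noise.

    Assumed model: a straight line clamped at a lower bound. The values of
    base, slope and floor are hypothetical, not taken from the source.
    """
    return max(floor, base - slope * noise_level)


def is_correct(similarity, noise_level):
    """Correct-incorrect determination using the noise-adjusted threshold."""
    return similarity >= noise_adjusted_threshold(noise_level)


# The same utterance scores lower as noise grows (hypothetical SP points).
observations = [(0.0, 0.85), (4.0, 0.65), (8.0, 0.45)]
for noise, sim in observations:
    fixed = sim >= 0.8  # fixed threshold for comparison
    adjusted = is_correct(sim, noise)
    print(noise, sim, fixed, adjusted)
```

With a fixed threshold, the same utterance is accepted in quiet conditions and rejected in noisy ones; lowering the threshold along with the similarity keeps the acceptance ratio roughly independent of the noise, which is the behavior the text attributes to this approach.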
In the above apparatus, the ratio at which the recognition results are displayed is controlled to be constant regardless of the magnitude of the noise around the apparatus. However, even when the magnitude of the noise is the same and the display ratio is the same, the degree of unpleasantness felt by the user may change according to the condition of the user.
For example, when the voice recognition apparatus is mounted on a vehicle and the vehicle stops temporarily, the driver does not need to pay attention to driving, and therefore the driver easily notices the display of the recognition results. Accordingly, the driver very frequently notices the display of recognition results corresponding to utterances that do not need to be recognized, and the display bothers the driver. In contrast, when the vehicle runs in an urban area on a rainy day, the driver needs to concentrate on driving and may not notice the display of the recognition results. Thus, the driver is unlikely to notice the display of recognition results corresponding to utterances that do not need to be recognized, and the driver's sense of unpleasantness is small.
Accordingly, in the apparatus disclosed in JP-A-2001-34291, the sense of unpleasantness of the driver is not always reduced when recognition results corresponding to utterances that do not need to be recognized are displayed.
It is conceivable to set the threshold large so as to reduce the frequency with which recognition results corresponding to utterances that do not need to be recognized are displayed. However, in this case, even correct recognition results become less likely to be displayed. Thus, the usability of the voice recognition apparatus is reduced.