A conventional voice recognition apparatus is known that is provided in advance with a word dictionary in which information on recognition words is registered and a reject word dictionary in which information on reject words is registered (for example, see Japanese Laid-open Patent Publication No. 2001-147698 and Japanese Patent No. 3105863). This voice recognition apparatus compares voice information inputted by a speaker with the information registered in the word dictionary and the reject word dictionary, and specifies the word that is most similar to the inputted voice information. When the specified word is a recognition word, the voice recognition apparatus outputs that recognition word as the result of recognition; when the specified word is a reject word, the apparatus rejects the inputted voice information on the ground that no result of recognition can be obtained.
In a structure using the above-mentioned reject word dictionary, voice information for which a reject word ranks first in the degree of similarity is rejected regardless of which word dictionary is used. For this reason, an arrangement is sometimes made so that a reject word that might adversely affect the result of recognition is not registered in the reject word dictionary.
More specifically, suppose, for example, that “OKINAWA” is the desired result of recognition for inputted voice information. Suppose further that, when the degree of similarity to the inputted voice information is expressed as a score out of 100 points, “OKINAA” scores 98 points, “OKINAKA” scores 95 points and “OKINAWA” scores 93 points.
Here, suppose that “OKINAKA” and “OKINAWA” are registered in the word dictionary and that “OKINAA” is registered in the reject word dictionary. In this case, since “OKINAA”, which has the highest degree of similarity, is registered in the reject word dictionary, the inputted voice information is rejected on the ground that no result of recognition can be obtained.
In contrast, suppose that “OKINAKA” and “OKINAWA” are registered in the word dictionary but “OKINAA” is not registered in the reject word dictionary. In this case, “OKINAA”, which has the highest degree of similarity, is registered in neither the word dictionary nor the reject word dictionary, whereas “OKINAKA”, which has the second highest degree of similarity, is registered in the word dictionary; “OKINAKA” is therefore outputted as the result of recognition.
As described above, for voice information that is recognized as “OKINAA”, “OKINAKA” and “OKINAWA” in descending order of the degree of similarity, the appropriate result of recognition, “OKINAWA”, cannot be obtained in either case, that is, whether or not “OKINAA” is registered as a reject word.
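The two cases in the example above can be reproduced with a minimal sketch. The function name `recognize`, the use of plain dictionaries and sets, and the assumption that only words registered in one of the two dictionaries are candidates are illustrative choices made here; they are not taken from the cited publications.

```python
def recognize(scores, word_dictionary, reject_dictionary):
    """Return the recognized word, or None when the input is rejected.

    scores: hypothetical mapping from a word to its degree of similarity
    (0 to 100 points) for one piece of inputted voice information.
    """
    # Assumption: only words registered in one of the dictionaries are compared.
    candidates = {
        word: score
        for word, score in scores.items()
        if word in word_dictionary or word in reject_dictionary
    }
    if not candidates:
        return None
    # Specify the registered word that is most similar to the input.
    best = max(candidates, key=candidates.get)
    # A reject word in first place rejects the input outright.
    return None if best in reject_dictionary else best


scores = {"OKINAA": 98, "OKINAKA": 95, "OKINAWA": 93}
word_dictionary = {"OKINAKA", "OKINAWA"}

# Case 1: "OKINAA" is registered as a reject word, so the input is rejected.
with_reject = recognize(scores, word_dictionary, {"OKINAA"})

# Case 2: "OKINAA" is not registered, so "OKINAKA" is outputted.
without_reject = recognize(scores, word_dictionary, set())
```

In neither case does the sketch return “OKINAWA”, which is the problem the example illustrates.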
Another apparatus has been proposed in which a weight to be applied to the likelihood ratio (degree of similarity) of an unknown word model is determined for each recognition word, and whether the result of recognition is adopted or rejected is determined by comparing the weighted likelihood ratio of the unknown word model with the result of recognition (for example, see Japanese Laid-open Patent Publication No. 10-171488). Still another apparatus has been proposed in which an appropriate reject word is generated in accordance with a feature of each recognition word registered in a word dictionary, and the generated reject word is registered in a reject word dictionary (for example, see Japanese Laid-open Patent Publication No. 2006-154658).
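The first of the two proposals above, the per-word weighting of the unknown word model, might be read along the following lines. This is only one possible interpretation. The multiplicative form of the weight, the direction of the comparison, the function name `accept_result` and all numeric values are assumptions introduced for illustration and are not stated in the cited publication.

```python
def accept_result(word, word_score, unknown_score, weights, default_weight=1.0):
    """Decide whether a recognition result is adopted (True) or rejected (False).

    weights: hypothetical per-recognition-word weight applied to the score
    of the unknown word model before the comparison.
    """
    # Assumption: the weight scales the unknown word model's score.
    threshold = weights.get(word, default_weight) * unknown_score
    # Assumption: the result is adopted when it is at least as good as
    # the weighted unknown word model.
    return word_score >= threshold


# Illustrative values only: a smaller weight makes a word easier to adopt.
weights = {"OKINAWA": 0.9}

adopt_okinawa = accept_result("OKINAWA", 93, 98, weights)  # 93 >= 88.2
adopt_okinaka = accept_result("OKINAKA", 95, 98, weights)  # 95 >= 98.0
```

Under these assumptions a recognition word with a smaller weight is adopted more readily, while a word left at the default weight must score at least as high as the unknown word model.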