1. Field of the Invention
The present invention relates to a speech recognition apparatus, a navigation apparatus including a speech recognition apparatus, and a speech recognition method.
2. Description of the Related Art
A navigation apparatus is known which receives position information indicating a current vehicle position from an artificial satellite, and displays the position information, together with map information, on a display to guide a driver. Some navigation apparatuses contain a speech recognition apparatus for recognizing voice commands issued by a user, for example to set a destination. Use of the speech recognition apparatus can make it easier for a user to operate the navigation apparatus. However, in order to recognize a large number of different spoken words, the speech recognition apparatus must perform a complicated speech recognition process in which each voice input is compared with the registered words. This process is time-consuming and can result in incorrect voice command recognition. One known technique to improve the recognition accuracy of the speech recognition apparatus is to reduce the number of words registered in a speech recognition dictionary, thereby reducing the number of similar words. A specific example of such a technique is to divide a speech recognition dictionary into a plurality of sub-dictionaries and select a proper sub-dictionary. By selecting a sub-dictionary, the number of similar words used at a time is reduced, and the accuracy of the recognition apparatus is improved.
Japanese Unexamined Patent Application Publication No. 8-320697 discloses a speech recognition apparatus which has a speech recognition dictionary divided into a plurality of sub-dictionaries. This apparatus also has the ability to adaptively select a proper sub-dictionary depending on the situation. For example, when this apparatus is implemented in a car navigation system, the speech recognition dictionary is divided in advance into sub-dictionaries according to names of cities or towns, and the proper sub-dictionary is selected based on the city/town name corresponding to the vehicle's current position. This speech recognition apparatus improves the recognition accuracy and reduces the recognition time by adaptively switching the sub-dictionary depending on the vehicle's current location.
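The location-based selection described above can be illustrated with a minimal sketch. It is not taken from the cited publication: the city names, the registered words, and the function names `select_sub_dictionary` and `full_dictionary` are all hypothetical, and a real apparatus would hold acoustic models rather than plain strings.

```python
# Hypothetical sub-dictionaries keyed by city/town name. The entries are
# illustrative assumptions, not data from the cited publication.
SUB_DICTIONARIES = {
    "Springfield": ["Main Street Station", "City Hall", "Central Park"],
    "Riverton": ["Riverton Station", "Harbor Museum", "Central Park"],
}


def select_sub_dictionary(current_city, sub_dictionaries):
    """Return the words registered for the current city, or an empty list."""
    return sub_dictionaries.get(current_city, [])


def full_dictionary(sub_dictionaries):
    """Return every distinct word across all sub-dictionaries, in first-seen order."""
    words = []
    for entries in sub_dictionaries.values():
        for word in entries:
            if word not in words:
                words.append(word)
    return words


# Only the words for the vehicle's current city are used for comparison.
active = select_sub_dictionary("Riverton", SUB_DICTIONARIES)
everything = full_dictionary(SUB_DICTIONARIES)
print(len(active), "words compared instead of", len(everything))
```

In this sketch the voice input would be compared with three words rather than five, which is the source of the accuracy and speed gains attributed to the known technique.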
However, the known techniques have several problems. The speech recognition apparatus disclosed in Japanese Unexamined Patent Application Publication No. 8-320697 improves voice input recognition accuracy for commands, such as setting a destination search condition, by dividing the speech recognition dictionary into sub-dictionaries according to the current vehicle position, such as the city/town name. However, this technique does not provide sub-dictionaries for the words used as operation commands to control the operation of the navigation apparatus. In this type of speech recognition apparatus, a voice command for operating the navigation apparatus is therefore compared with all words registered in the speech recognition dictionary. FIG. 6 shows a process typically performed by a navigation speech recognition apparatus to recognize a voice input by comparing it with words registered in a speech recognition dictionary in order to produce a recognition result. As shown in FIG. 6A, for example, when a voice command of “next” is input by a user, the speech recognition apparatus performs a recognition process that includes comparing the voice input with words registered in the speech recognition dictionary.
During the recognition process, words whose similarity to the voice input “next” is higher than a predetermined level are extracted as recognition candidates from the speech recognition dictionary. In the present example, as shown in FIG. 6B, “next,” “neck,” and “nick” are extracted from the speech recognition dictionary shown in FIG. 6A. The similarity is quantitatively expressed, for example, as a score. The speech recognition apparatus then selects the word with the highest score from the recognition candidates, and provides the selected word as the recognition result. In the example shown in FIG. 6B, the word “next” has a score of “98.” Because “98” is the highest score of the recognition candidates, “next” is provided as the recognition result. However, problems arise in noisy environments because noise is input together with the voice command to the speech recognition apparatus, thereby increasing the possibility that an incorrect word will be selected as the recognition result. FIG. 6C shows an example of the speech recognition apparatus providing the wrong recognition result in a noisy environment. In this example, the user inputs a voice command of “next.” However, due to the noisy environment, the speech recognition apparatus incorrectly evaluates the word “neck” as having the highest score of “98,” and provides the word “neck” as the recognition result.
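The candidate extraction and selection steps described above can be sketched as follows. This is an illustrative sketch only: the function name `recognize`, the threshold of 80, and every score other than the two values of “98” quoted from FIG. 6B and FIG. 6C are assumptions, and the way similarity scores are actually computed from the audio is outside its scope.

```python
def recognize(scores, threshold):
    """Extract candidates scoring above the threshold and pick the best one.

    Returns (best_word, candidates); best_word is None if nothing qualifies.
    """
    # Words whose similarity exceeds the predetermined level become candidates.
    candidates = {word: s for word, s in scores.items() if s > threshold}
    if not candidates:
        return None, candidates
    # The candidate with the highest score is provided as the result.
    best_word = max(candidates, key=candidates.get)
    return best_word, candidates


THRESHOLD = 80  # assumed "predetermined level"

# Quiet case: "next" scores 98 as in FIG. 6B; the other scores are assumed.
quiet_scores = {"next": 98, "neck": 90, "nick": 85, "scan": 20}

# Noisy case: "neck" scores 98 as in FIG. 6C; the other scores are assumed.
noisy_scores = {"next": 95, "neck": 98, "nick": 85, "scan": 20}

quiet_result, quiet_candidates = recognize(quiet_scores, THRESHOLD)
noisy_result, _ = recognize(noisy_scores, THRESHOLD)
print(quiet_result, noisy_result)
```

Because the selection depends only on which candidate has the highest score, a small noise-induced shift between two similar words is enough to change the recognition result.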
FIG. 7 shows examples of words that are difficult for speech recognition apparatuses commonly used in navigation apparatuses to correctly distinguish. When the speech recognition apparatus compares a voice input with all the words registered in the speech recognition dictionary, if the voice input includes noise there is a possibility that the apparatus may provide the wrong recognition result. For example, on a mail reading screen, when the user issues a voice command of “next,” there is a possibility that the apparatus will incorrectly recognize the voice input as “neck” or “nick.” In another example, when a voice command of “CD” is input by a user to control the operation of an audio device, there is a possibility that the speech recognition apparatus will incorrectly recognize the voice input as “seed” or “she'd.” Similarly, when a voice command of “scan” is input by the user to control the operation of the audio device, there is a possibility that the speech recognition apparatus will incorrectly recognize the voice input as “fan” or “can.”
Because the recognition process compares the voice input command to all the words registered in the recognition dictionary, the probability of incorrect voice input recognition tends to increase with the number of words registered in the speech recognition dictionary. Incorrect recognition also tends to increase as the length of the voice input decreases. One possible technique to prevent such incorrect recognition is to use alternative words with longer lengths having the same meaning, instead of short words. However, if the user is not allowed to use the short words they are accustomed to, the apparatus will not be convenient to use and will have no commercial value.
In view of the above, it is an object of the present invention to provide a speech recognition apparatus, a navigation apparatus including a speech recognition apparatus, and a speech recognition method capable of performing high-accuracy, substantially error-free recognition of voice inputs.