Conventionally, a voice recognition apparatus compares a voice input with a plurality of pre-stored comparison patterns and outputs the recognition result having a high degree of matching. A voice recognition apparatus may be used for the voice input of a destination to be set in, for example, a navigation apparatus. Such voice input of a destination is especially beneficial for a user who is a driver of a vehicle, because it allows him/her to set the destination without performing a button operation and without looking at a display screen of the navigation apparatus, thereby improving the safety of the user.
To allow the voice input of a destination, the voice recognition apparatus should allow the user to input a specific place name or the like with ease. For instance, the user should be allowed to input not only the prefecture name and the city name but also the county, town, street name and/or a section of a village. Further, when the user would like to set a destination such as “Main St., New York City, N.Y.,” the user may prefer to input the address in one stretch of voice (i.e., in one breath) for ease of input, rather than adding pauses between certain words or groups of words (i.e., separately voicing the same address), for example, “New York” <pause>, “New York City” <pause>, “Main St.” Such an uninterrupted series of words, or a command data combination input, may be referred to as a continuous input.
To accept the continuous input, Japanese Patent Laid-Open No. 2001-306088 (JP '088) discloses a “tree structure” recognition dictionary, which hierarchically connects/combines recognition objects (i.e., voiced words), sifting the recognition vocabulary for each of the hierarchies. Alternatively, Japanese Patent Laid-Open No. 2003-114696 (JP '696) allows the continuous input of, for example, a street address of the United States, which does not fit the tree structure recognition dictionary of JP '088, by hierarchically combining the words from a lower to a higher hierarchy, instead of combining the words from a higher to a lower hierarchy, which is usually the case.
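For illustration only, the general idea of a tree structure recognition dictionary can be sketched as nested mappings in which each recognized word selects the vocabulary of the next hierarchy. The sketch below is not taken from JP '088 or JP '696; the function names `build_dictionary` and `active_vocabulary`, the `lower_to_higher` flag, and the sample addresses are all hypothetical.

```python
# Illustrative sketch only: a tree-structured dictionary as nested dicts.
# All names and sample entries here are hypothetical.

# Each address is listed from the higher to the lower hierarchy:
# (state, city, street).
ADDRESSES = [
    ("New York", "New York City", "Main St"),
    ("New York", "New York City", "Broadway"),
    ("New York", "Albany", "State St"),
    ("California", "Los Angeles", "Main St"),
]


def build_dictionary(addresses, lower_to_higher=False):
    """Combine the words of each address into a tree of nested dicts.

    With lower_to_higher=True the words are combined starting from the
    lowest hierarchy (street first), as in a spoken U.S. street address.
    """
    root = {}
    for address in addresses:
        words = reversed(address) if lower_to_higher else address
        node = root
        for word in words:
            node = node.setdefault(word, {})
    return root


def active_vocabulary(dictionary, recognized_so_far):
    """Return the words that may follow the words recognized so far."""
    node = dictionary
    for word in recognized_so_far:
        node = node[word]
    return sorted(node)


top_down = build_dictionary(ADDRESSES)
bottom_up = build_dictionary(ADDRESSES, lower_to_higher=True)

print(active_vocabulary(top_down, ["New York"]))
print(active_vocabulary(bottom_up, ["Main St"]))
```

In this sketch, each recognized word restricts the candidates for the next word to the children of the current node, which is one way to picture the sifting of the vocabulary at each hierarchy.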
However, to allow the continuous input, the voice recognition apparatus has to have a much larger dictionary that carries a much larger recognition vocabulary than an apparatus that allows only a normal, hierarchical input. Further, as the vocabulary is expanded, the rate of successfully recognizing words generally drops. Therefore, in the conventional voice recognition apparatus, the continuous input is limited to only one data-category, such as a street address, for the purpose of improving the recognition rate. In such a configuration, the apparatus has to be put into a certain operation mode by a specific command, so that only one data-category can be input, before the continuous input for the voice recognition is actually performed. However, having to input such a specific mode-setting command before performing the voice recognition may be cumbersome for the user, even when the command is only one word.
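The growth of the vocabulary can be pictured with a rough count. The sketch below assumes, purely for illustration, a uniform number of entries per hierarchy; the figures and the function names are hypothetical and do not come from any actual dictionary.

```python
# Illustrative arithmetic only; the branching figures are hypothetical.
import math


def hierarchical_vocabulary(entries_per_level):
    """Largest word list compared at any single step of a hierarchical input."""
    return max(entries_per_level)


def continuous_vocabulary(entries_per_level):
    """Number of complete phrases to be compared for a continuous input."""
    return math.prod(entries_per_level)


# Assumed: 50 states, 100 cities per state, 200 streets per city.
levels = (50, 100, 200)

print(hierarchical_vocabulary(levels))  # 200
print(continuous_vocabulary(levels))    # 1000000
```

Under these assumed figures, a hierarchical input compares at most 200 candidates at a time, whereas a continuous input has to distinguish among one million complete phrases.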