Described below is a method for speech recognition from a predefinable vocabulary.
Speech recognition systems which can recognize individual words or word strings from a predefinable vocabulary are customarily used for operating telephones or non-safety-related components of the equipment of a motor vehicle by spoken commands. Further known examples relate to the operation of surgical microscopes by the operating physician and to the operation of personal computers.
In the operation of a car navigation system, for example, a desired destination location can be communicated by speech input. Two methods for doing this are known, and these are set out briefly below.
According to a first method, the more than 70,000 possible German destination locations are grouped by region. This gives rise to a total of approx. 1,000 regions, each of which is characterized by a large central location. Since the assignment of small locations to large ones is not unambiguous and is also difficult for a user to judge, individual locations are assigned to a plurality of regions. The location “Holzkirchen” lies, for example, both in the “Munich” region and in the “Bad Tölz” region. A destination location is input in a two-stage user dialog, the user first specifying a major town close to the desired destination location. After the destination region has been recognized, optionally after selection from a menu, the user is prompted to name the precise destination location within the destination region. The user can then confirm the desired input from among the recognizer hypotheses by voice or on a keyboard. The navigation data associated with a region is stored on a DVD in a coherent block, as a result of which the search procedure for data on the DVD can be speeded up considerably.
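The two-stage lookup of the first method can be illustrated by the following minimal Python sketch. It is not taken from any actual navigation system: the names LOCATION_REGIONS, build_region_index and stage_two_vocabulary, as well as the toy data apart from the “Holzkirchen” example, are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: a location may be assigned to several regions,
# as with "Holzkirchen" in the "Munich" and "Bad Tölz" regions.
LOCATION_REGIONS = {
    "Holzkirchen": ["Munich", "Bad Tölz"],
    "Munich": ["Munich"],
    "Dachau": ["Munich"],
    "Bad Tölz": ["Bad Tölz"],
    "Lenggries": ["Bad Tölz"],
}


def build_region_index(location_regions):
    """Invert the location -> regions mapping into region -> locations."""
    index = defaultdict(set)
    for location, regions in location_regions.items():
        for region in regions:
            index[region].add(location)
    return index


def stage_two_vocabulary(index, recognized_region):
    """Return the restricted vocabulary for the second dialog stage.

    The region name is assumed to have been recognized in the first stage;
    an unknown region yields an empty vocabulary.
    """
    return sorted(index.get(recognized_region, ()))


region_index = build_region_index(LOCATION_REGIONS)
munich_vocabulary = stage_two_vocabulary(region_index, "Munich")
toelz_vocabulary = stage_two_vocabulary(region_index, "Bad Tölz")
```

In this sketch the second recognition stage only has to distinguish between the locations of one region, which is the reason why the first method keeps the active vocabulary small, at the cost of a second input by the user.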
In a second method, a user communicates a destination location to the navigation system by spelling out the initial letters. With the aid of the recognized sequence of letters, the navigation system determines, from the set of all locations, those locations whose initial letters are similar to the recognized letter sequence. The locations, sorted according to similarity, are presented to the user in a menu for further selection. The user can then in turn specify the desired destination location by voice input or via a keyboard.
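The ranking step of the second method can be sketched as follows in Python. The similarity measure used by known systems is not specified here; the sketch assumes, purely for illustration, the ratio computed by difflib.SequenceMatcher from the Python standard library, and the function name rank_by_initial_letters and the sample locations are likewise hypothetical.

```python
from difflib import SequenceMatcher


def rank_by_initial_letters(recognized, locations, top_n=5):
    """Rank all locations by similarity of their initial letters.

    recognized: the letter sequence returned by the spelling recognizer.
    locations:  the complete list of location names (scanned in full).
    Returns the top_n names, most similar first; ties are broken
    alphabetically.
    """
    recognized = recognized.upper()
    prefix_length = len(recognized)
    scored = []
    for name in locations:
        # Compare only as many initial letters as were spelled out.
        prefix = name.upper()[:prefix_length]
        score = SequenceMatcher(None, recognized, prefix).ratio()
        scored.append((score, name))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_n]]


sample_locations = ["Hamburg", "Holzminden", "Munich", "Holzkirchen"]
menu = rank_by_initial_letters("HOLZK", sample_locations, top_n=2)
```

Note that the loop visits every entry of the location list for each recognition procedure, which corresponds to the search within the complete list that this method requires.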
A disadvantage of the first method is that users have to perform the input for their destination location at least twice. The method also entails further losses in convenience, since the assignment of small locations to large ones is not unambiguous and, moreover, requires prior geographical knowledge on the part of the user. In the case of the second method, a search has to be performed within the complete list of all possible locations for each recognition procedure in order to compile an appropriate menu thereafter. The loading and processing times before the menu is displayed are very long, which is why the method finds little acceptance among users.