The invention relates to a method and a device for the voice recognition of a word.
Voice recognition systems are becoming increasingly widespread in many areas of technology.
In the case of dictation systems, voice recognition techniques are used for the automatic creation of written text. Dictation systems of this type are based on the recognition of individual words or syllables. Apart from the word or syllable recognition, they often have a spelling mode, which, if the system fails to recognize a word, prompts the user to say the word letter by letter.
Other known voice recognition applications are based from the outset on a letter-by-letter input of a word. Systems of this type are referred to as spelling recognition units. Spelling recognition units are used, for example, in navigation systems for motor vehicles with voice input of destination information. The navigation system must be able to distinguish between a very large number of words, some of which sound similar (names of towns, street names, names of hotels, restaurants and bars, etc.), which can be ensured with adequate certainty by letter-by-letter input of the word. However, it is disadvantageous that spelling requires a relatively high degree of concentration, which cannot always be provided when maneuvering a vehicle.
It is accordingly an object of the invention to provide a voice recognition method and an associated device which overcome the above-mentioned disadvantages of the prior art methods and devices of this general type and which, when a word is not definitely recognized in a word recognition mode, proceed through a user-friendly sequence for finding the word being sought.
With the foregoing and other objects in view there is provided, in accordance with the invention, a method for interactive voice recognition of a word by a voice recognition system. The method includes performing a word recognition mode by the steps of: converting a spoken word into an electrical word voice signal; and analyzing the electrical word voice signal for recognizing the spoken word from a vocabulary of predetermined words. The following steps are performed if a definite assignment of the electrical word voice signal to a word from the vocabulary of predetermined words cannot be made: compiling a preselection of words from the vocabulary of predetermined words which have a sufficient probability of being the spoken word; determining for each word forming the preselection of words, at least one decisive letter which makes the word distinguishable from other words in the preselection of words; inquiring which of the decisive letters for the words of the preselection of words is appropriate during a spelling recognition mode; converting a spelling voice input into an assigned electrical spelling voice signal; and analyzing the assigned electrical spelling voice signal for recognizing the decisive letter.
The invention is based on the realization that it is generally not necessary to make the user spell the word not definitely recognized in the word recognition mode from the beginning in the spelling recognition mode. Rather, it is sufficient to make a limited number of words contained in the preselection list distinguishable on the basis of suitably chosen decisive letters and then to determine in the spelling recognition mode the word being sought by specific inquiry of the decisive letter assigned to the word being sought.
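The determination of decisive letters described above can be illustrated with a minimal sketch. It assumes one simple selection rule that is not prescribed by the text: for each word, take the first letter position at which no other word of the preselection has the same letter. The function name `decisive_letters` and the example town names are hypothetical.

```python
def decisive_letters(preselection):
    """Map each word to (index, letter) of its first distinguishing letter.

    Assumed rule: a letter is decisive if no other word in the preselection
    has the same letter at the same position. A word maps to None when no
    single position distinguishes it; such a word would need more than one
    decisive letter, which this sketch does not attempt to find.
    """
    result = {}
    for word in preselection:
        others = [w for w in preselection if w != word]
        result[word] = None
        for i, ch in enumerate(word.lower()):
            # A shorter word has no letter at position i, so it cannot clash.
            if all(i >= len(o) or o[i].lower() != ch for o in others):
                result[word] = (i, ch)
                break
    return result


if __name__ == "__main__":
    for word, hit in decisive_letters(["Hamburg", "Homburg", "Limburg"]).items():
        print(word, hit)
```

With this rule, a preselection such as "Meier", "Maier", "Mayer" leaves "Maier" without a single distinguishing position, which is one reason the text speaks of at least one decisive letter per word.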
In the inquiry of the decisive letter, the decisive letters previously determined with respect to the words of the preselection list are preferably suggested to the user by the voice recognition system, thereby increasing the interactivity of the system.
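A possible shape for this inquiry step is sketched below, assuming the decisive letters are already available as a mapping from word to a single letter. The prompt wording and the names `build_inquiry` and `resolve` are illustrative assumptions, not taken from the text.

```python
def build_inquiry(decisive):
    """Compose a prompt that suggests each decisive letter to the user.

    `decisive` maps each preselected word to its decisive letter.
    """
    options = [f"'{letter.upper()}' for {word}" for word, letter in decisive.items()]
    return "Please say the letter: " + ", ".join(options) + "."


def resolve(decisive, recognized_letter):
    """Return the word whose decisive letter was recognized, else None."""
    wanted = recognized_letter.lower()
    matches = [word for word, letter in decisive.items() if letter.lower() == wanted]
    # Only an unambiguous match identifies the word being sought.
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    table = {"Hamburg": "a", "Homburg": "o", "Limburg": "l"}
    print(build_inquiry(table))
    print(resolve(table, "O"))
```

The prompt string could equally be passed to a speech output or shown on a screen; the sketch leaves the output channel open.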
Although the inquiry can, in principle, also take place visually, in many applications it is expedient to provide an acoustic inquiry.
In practice, it may happen that, on account of suddenly occurring ambient noises or initially indistinct pronunciation by the user, a repetition of the spoken word appears expedient. An advantageous refinement of the method according to the invention is therefore characterized in that the user is prompted to repeat the spoken word if the number of words contained in the preselection list exceeds a predetermined limit value. In the repetition of the word, there may be fewer disturbances through ambient noises and experience shows that the user endeavors to speak more clearly, so that usually a more favorable preselection is available as a result for the following inquiry and spelling recognition steps than in the case of the first attempt.
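The branching on the size of the preselection can be sketched as follows. The limit value of 5, the step names, and the handling of an empty preselection are assumptions made for illustration only; the text specifies only that a predetermined limit value exists.

```python
def next_step(preselection, limit=5):
    """Choose the next dialog step from the size of the preselection.

    `limit` stands for the predetermined limit value; 5 is an arbitrary
    illustrative default.
    """
    if len(preselection) == 1:
        return "accept"                  # definite assignment, nothing to ask
    if not preselection or len(preselection) > limit:
        # Too many candidates. An empty preselection is treated the same
        # way here, which is an assumption not stated in the text.
        return "repeat_word"
    return "inquire_decisive_letter"     # small enough for the spelling inquiry


if __name__ == "__main__":
    print(next_step(["Hamburg", "Homburg", "Limburg"]))
```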
The spelling recognition mode may be an alphabet-word recognition mode or a letter recognition mode. In the first case, the operator convenience can be increased in an advantageous way by a number of different alphabet words being assigned to an individual letter (for example “Anton”, “Alpha”, “Alfred” for the letter a). The user then has several possibilities to name a letter in the alphabet-word recognition mode.
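The assignment of several alphabet words to one letter amounts to a many-to-one lookup, sketched below. Only the three words for the letter a come from the text; the entries for b and c and the name `letter_for` are illustrative assumptions.

```python
# Several alphabet words per letter. The entries for "a" are from the text;
# the entries for "b" and "c" are assumed examples.
ALPHABET_WORDS = {
    "a": ["Anton", "Alpha", "Alfred"],
    "b": ["Berta", "Bravo"],
    "c": ["Caesar", "Charlie"],
}

# Inverted table: recognized alphabet word (lowercase) -> letter.
WORD_TO_LETTER = {
    word.lower(): letter
    for letter, words in ALPHABET_WORDS.items()
    for word in words
}


def letter_for(spoken_word):
    """Return the letter named by an alphabet word, or None if unknown."""
    return WORD_TO_LETTER.get(spoken_word.strip().lower())


if __name__ == "__main__":
    print(letter_for("Anton"), letter_for("Alfred"), letter_for("Bravo"))
```

Inverting the table once keeps each lookup a single dictionary access, regardless of how many alphabet words are assigned to a letter.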
With the foregoing and other objects in view there is provided, in accordance with the invention, a device for voice recognition. The device contains a word recognition unit for converting a spoken word into an electrical word voice signal and for analyzing the electrical word voice signal for recognizing a word from a vocabulary of predetermined words. A selection logic is provided, which, if a definite assignment of the electrical word voice signal to a word of the vocabulary of predetermined words cannot be made, compiles a preselection of words from the vocabulary of predetermined words among which the spoken word is located with sufficient certainty. The selection logic is coupled to the word recognition unit. A logic circuit is provided for determining, for each word of the preselection of words, at least one decisive letter which makes the word distinguishable from other words of the preselection of words. The logic circuit is coupled to the word recognition unit. An output unit is provided for outputting the decisive letter for each of the words of the preselection of words. The output unit is coupled to the word recognition unit. A spelling recognition unit is provided for converting a spelling voice input into an associated electrical spelling voice signal. The spelling recognition unit also analyzes the associated electrical spelling voice signal for recognizing the decisive letter. The spelling recognition unit is connected to the word recognition unit.
The device according to the invention is used with particular advantage in a navigation system for a vehicle, in particular a motor vehicle, since the simple procedure for inputting destination information into the system distracts the driver's attention from the road traffic only to a very small degree.
Other features which are considered as characteristic for the invention are set forth in the appended claims.
Although the invention is illustrated and described herein as embodied in a voice recognition method and an associated device, it is nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made therein without departing from the spirit of the invention and within the scope and range of equivalents of the claims.
The construction and method of operation of the invention, however, together with additional objects and advantages thereof will be best understood from the following description of specific embodiments when read in connection with the accompanying drawings.