1. Field of the Invention
This invention relates to a speech recognition system for selecting, via a speech input, an item from a list of items.
2. Related Art
In many applications, such as navigation, name dialing or audio/video player control, it is necessary to select an item or an entry from a large list of items or entries, such as proper names, addresses or music titles. To enable speech recognition with moderate memory and processor resources, a two-step speech recognition approach is frequently applied. In a first step, a phoneme sequence or string is recognized by a phoneme recognizer. However, the recognition accuracy of phoneme recognition is usually not acceptable, and many substitutions, insertions and deletions of phonemes occur. The phoneme string is then compared with a possibly large list of phonetically transcribed items to determine a shorter candidate list of best matching items. The candidate list is then supplied to a speech recognizer as a new vocabulary for the second recognition pass. In this second step, the most likely entry or entries in the list for the same speech input are determined by matching phonetic acoustic representations of the entries listed in the candidate list to the acoustic input and determining the best matching entries. This approach saves computational resources, since the phoneme recognition performed in the first step is less demanding and the computationally expensive second step is performed only with a small subset of the large list of entries. A two-step speech recognition approach as discussed above is known in the art.
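By way of illustration only, the comparison of the recognized phoneme string with the list of phonetically transcribed items may be sketched as follows. The sketch assumes a plain Levenshtein (edit) distance as the matching measure, which tolerates the substitutions, insertions and deletions mentioned above; the known approach does not prescribe any particular measure. The function names `edit_distance` and `candidate_list`, the phoneme symbols and the small example list are hypothetical and are not taken from any particular system.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # phoneme deleted
                            curr[j - 1] + 1,      # phoneme inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def candidate_list(recognized: Sequence[str],
                   transcribed_items: Dict[str, List[str]],
                   size: int) -> List[str]:
    """Return the `size` items whose transcriptions best match the
    recognized phoneme string (smallest edit distance first)."""
    scored = sorted(
        (edit_distance(recognized, phonemes), name)
        for name, phonemes in transcribed_items.items()
    )
    return [name for _, name in scored[:size]]


# Hypothetical list of phonetically transcribed city names.
TRANSCRIBED_ITEMS = {
    "Ulm": ["U", "l", "m"],
    "Bonn": ["b", "O", "n"],
    "Berlin": ["b", "E", "r", "l", "i:", "n"],
    "Bern": ["b", "E", "r", "n"],
    "Hamburg": ["h", "a", "m", "b", "U", "r", "k"],
}

# Output of the first step for the utterance "Berlin", with one phoneme lost.
RECOGNIZED = ["b", "E", "l", "i:", "n"]

CANDIDATES = candidate_list(RECOGNIZED, TRANSCRIBED_ITEMS, size=2)
```

In this example, `CANDIDATES` holds "Berlin" and "Bern", which would then serve as the reduced vocabulary for the second recognition pass. The second pass itself, i.e., the matching of acoustic representations against the speech input, is not shown.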
In the case of a speech-controlled navigation system, the best matching items, i.e., the names of the cities that correspond to the speech input from the speaker, can then be listed to the speaker for further selection. It may happen that the first speech input was recorded at a very low quality because, by way of example, short-term noise was present in the surroundings of the speaker during the utterance. Accordingly, it may happen that the list of displayed item names does not contain the name of the item initially intended by the speaker.
The speaker can then utter the intended name a second time, hoping that this time the speech input will be correctly processed and the intended item will be listed or directly selected.
A need therefore exists to improve the speech recognition for selecting an item from a list of items, in which the computational effort is minimized.