Voice recognition engines are used to translate spoken words into text, in order to execute desired user commands. Example voice recognition applications include voice dialling, call routing, home appliance control, in-vehicle applications, search, data entry, preparation of structured documents, and speech-to-text processing (e.g., word processors or emails).
With new laws banning the handling of electronic devices while driving, voice-activated dialling (e.g., over a Bluetooth™ headset) has become more common. End-user experience is shaped by the ability of the voice recognition engine to accurately resolve the commands and the contact referenced. A major challenge for voice recognition engines is dealing with phonetic variations arising from names that originate in different languages and from end-user accents. The detection accuracy problem is further amplified when the audio path is extended, for example when audio passes through the Bluetooth™ headset microphone instead of the resident microphone of the handheld phone.
Some conventional voice recognition engines are trained merely by having the user read a known paragraph at setup time.
Additional difficulties with some existing systems may be appreciated in view of the detailed description below.
Like reference numerals are used throughout the Figures to denote similar elements and features.