Automatic Speech Recognition (ASR) is used to augment or replace computer keyboards, telephone keypads, mice, and related devices to control computer applications. For example, a common application of ASR allows users to control a remote server by speaking commands through a telephone. The user may be able to request information from and/or modify a remote database by voice alone or in combination with the telephone keypad.
The effectiveness of these systems is governed by the number of distinct words they can recognize accurately. As the vocabulary associated with a particular system grows, so does recognition difficulty. Recognition accuracy is further affected by variations in pronunciation among speakers. The combination of the words a system can recognize and information about their various expected pronunciations is called an ASR “grammar.”
Proper names of people are especially difficult to recognize. The number of possible names is effectively limitless, and pronunciations may vary significantly depending upon the origin of the name, the language being spoken, and the regional dialect or native language of the person speaking. For example, “Rzegocki” (of Polish origin and properly pronounced “sha-guts-ki”) is baffling to most speakers of American English. Pronouncing it correctly requires an uncommon knowledge of the Polish language and of the rules of transliteration for Polish diacritical markings not used in English. As a result, mispronunciations such as “are-ze-gockee” are common, making ASR systems significantly less effective. This problem is further compounded by global migration and commerce. A native Japanese or Hindi speaker interacting with an American English-based ASR system would further complicate the recognition of this and other “non-native” names.
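To make the notion of a grammar concrete, the following is a minimal sketch, not an actual ASR implementation, of a structure that pairs each name with several expected pronunciations, using the “Rzegocki” example above. The names NameGrammar, normalize, add, and lookup are hypothetical and introduced only for illustration. The sketch also assumes that pronunciations are rough phonetic spellings compared by exact string match, whereas a real recognizer would match phoneme sequences against audio.

```python
from collections import defaultdict


def normalize(spelling: str) -> str:
    """Lowercase a rough phonetic spelling and drop hyphens, spaces, and punctuation."""
    return "".join(ch for ch in spelling.lower() if ch.isalnum())


class NameGrammar:
    """Toy grammar (hypothetical): maps each name to its expected pronunciations."""

    def __init__(self) -> None:
        # name -> set of normalized phonetic spellings
        self._pronunciations = defaultdict(set)

    def add(self, name: str, *spellings: str) -> None:
        """Register one or more expected pronunciations for a name."""
        for spelling in spellings:
            self._pronunciations[name].add(normalize(spelling))

    def lookup(self, heard: str) -> list:
        """Return every name for which `heard` is a listed pronunciation."""
        key = normalize(heard)
        return sorted(
            name
            for name, variants in self._pronunciations.items()
            if key in variants
        )


grammar = NameGrammar()
# Both the proper pronunciation and a common mispronunciation map to one name.
grammar.add("Rzegocki", "sha-guts-ki", "are-ze-gockee")

print(grammar.lookup("are-ze-gockee"))  # ['Rzegocki']
print(grammar.lookup("Sha Guts Ki"))    # ['Rzegocki']
print(grammar.lookup("smith"))          # []
```

Listing the mispronunciation alongside the proper pronunciation lets either one resolve to the same name. Each added variant also enlarges the set of candidates the recognizer must distinguish, which is the trade-off between vocabulary size and accuracy noted earlier.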
Despite this complexity, the ability to control the recording and transmission of voicemails and other aspects of the operation of modern communication systems through verbal commands greatly simplifies their use. As a result, the recognition of proper names is highly desirable for phone-based and other appropriate forms of communication applications. For example, addressing a voicemail message, transferring a call, retrieving contact information, or requesting an appointment all depend upon accurate recognition of the name of the person being addressed.