1. Field of Invention
This invention relates to a method and apparatus for automatic speech recognition.
2. Description of Related Art
Mobile device usage has increased as mobile devices can store more information and as more information can be accessed over networks. However, conventional input methods for mobile devices such as web-enabled phones, personal communication systems, handheld personal digital assistants and other mobile devices are limited. For example, the size of keyboards on mobile devices is limited due to the need to make the mobile device as small and compact as possible.
Conventional limited size keyboards typically use multi-function keys to further reduce size and space requirements. A multi-function key is a key whose function depends on the key sequence previously selected, so that a single key can perform many different functions. However, as the number of additional functions increases, multi-function keyboards become difficult to use and the input method becomes error-prone. Decreasing the size of keyboards with multi-function keys further increases the likelihood of mis-keying due to the smaller key size. Thus, decreased size multi-function keys are also error-prone and difficult to use. Some manufacturers have attempted to address these problems with the use of predictive text entry input methods. For example, a type of predictive text entry system used in many web-enabled phones attempts to predict complete words as the keystrokes for each word are entered. However, this predictive text entry system mis-identifies words, is not easily adapted to words in different languages, requires the use of a keyboard and is not easy to use.
Some manufacturers of mobile devices have attempted to address keyboard input problems by increasing the size of the mobile device keyboard. For example, the Ericsson model R380 and R380s web-enabled phones are equipped with a flip-up keypad that reveals a larger touch-sensitive screen for input functions. However, these touch-sensitive screens are expensive, increase the likelihood of damage to the device, increase power requirements and therefore battery size, and fail to provide the user with an input method that is easy to use.
Some personal digital assistant device manufacturers such as Palm and Handspring have attempted to address these limitations of conventional input methods by adding handwriting recognition software to mobile devices such as personal digital assistants. However, handwriting recognition software is also error-prone, requires that the user be trained to write in ways easily recognizable by the handwriting recognition software and fails to provide an input method that is easy to use.
Automatic speech recognition provides an easy-to-use input method for mobile devices. However, some conventional speech recognition systems for mobile devices provide speech recognition tailored for one specific device or as voice dialing systems and may require user intervention such as training. If the user must replace a lost or damaged device with a new device, the new device must be retrained before use or the accuracy of the device is lessened. Also, as the user's usage environment deviates from the training environment, the accuracy of these conventional speech recognition systems is reduced. Moreover, the size, power and space limitations of mobile devices also limit the size, complexity and power of the speech recognizer.
Other conventional speech recognition systems use speaker-independent models either in the device or in the network. However, these conventional speaker-independent speech recognition devices do not automatically compensate for changing environments and/or differing transducer response characteristics.
For example, each model of phone is likely to use a transducer with different response characteristics. The response characteristics associated with a head-mounted transducer or microphone used in a home office environment are likely to differ from the response characteristics of a Jabra hands-free EarSet® microphone used by a hands-free mobile phone user. Conventional speech recognition systems assume each mobile device has the same response characteristics, with the result that the accuracy of the speech recognition is reduced.
Background noise varies in a similar way. A user in a home office environment will experience quiet and predictable background noise. In contrast, a mobile phone user will likely experience constantly changing and frequently loud background noise. Conventional speech recognition systems assume each mobile device experiences the same background noise, resulting in reduced accuracy of the speech recognition system.
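The effect of a mismatched transducer or noise environment can be illustrated with a minimal sketch. It is not part of any system described above. It assumes a simplified signal model in which the observed energy in each frequency band is the clean speech energy scaled by a transducer gain plus an additive noise energy, and it uses the distance between log band energies as a stand-in for recognizer mismatch. All band energies, gains, noise levels and function names are hypothetical.

```python
import math

def log_band_energies(clean, gain, noise):
    """Log energy per band after a per-band transducer gain and additive noise."""
    return [math.log(g * c + n) for c, g, n in zip(clean, gain, noise)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical band energies for one speech frame (arbitrary units).
clean = [4.0, 2.0, 1.0, 0.5]

# Condition assumed by the hypothetical model: flat response, no noise.
flat = [1.0, 1.0, 1.0, 1.0]
silent = [0.0, 0.0, 0.0, 0.0]
template = log_band_energies(clean, flat, silent)

# Hypothetical mismatched conditions: a transducer that attenuates the
# higher bands, and a noisy environment with a flat noise floor.
handset = [1.0, 0.8, 0.5, 0.25]
street = [0.2, 0.2, 0.2, 0.2]

matched = distance(template, log_band_energies(clean, flat, silent))
channel_only = distance(template, log_band_energies(clean, handset, silent))
noise_only = distance(template, log_band_energies(clean, flat, street))

print(matched, channel_only, noise_only)
```

Under these assumptions the matched condition reproduces the template exactly, while a different transducer response or added noise alone moves the features away from it. A recognizer that assumes a single response characteristic and a single noise environment would compare such shifted features against unshifted models.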