1. Field
The disclosure relates generally to the field of speech recognition systems and methods, and, in particular, to speech recognition systems and methods having improved speech recognition.
2. Background
Speech recognition (SR), also commonly referred to as voice recognition, represents one of the most important techniques for endowing a machine with simulated intelligence to recognize user-voiced commands and to facilitate the human interface with the machine. SR also represents a key technique for human speech understanding. Systems that employ techniques to recover a linguistic message from an acoustic speech signal are called speech recognizers. The term “speech recognizer” is used herein to mean generally any spoken-user-interface-enabled device or system.
The use of SR is becoming increasingly important for safety reasons. For example, SR may be used to replace the manual task of pushing buttons on a wireless telephone keypad. This is especially important when a user is initiating a telephone call while driving a car. When using a phone without SR, the driver must remove one hand from the steering wheel and look at the phone keypad while pushing the buttons to dial the call. These acts increase the likelihood of a car accident. A speech-enabled phone (i.e., a phone designed for speech recognition) would allow the driver to place telephone calls while continuously watching the road. In addition, a hands-free car-kit system would permit the driver to maintain both hands on the steering wheel during call initiation.
Automatic speech recognition (ASR) systems, such as always-on ASR systems, have difficulty handling ambient noise, such as background conversation or other undesired noise. The presence of background conversation, for instance, may cause the system to recognize a command that was not intended by the user of the system, thus leading to false positives and misrecognitions. For example, in a car environment, the driver might be the one issuing voice commands to control the ASR system, but conversation among the passengers can greatly reduce the performance of the ASR system, because their speech may be detected as commands that are false positives or misrecognitions.