Technical Field
The invention relates to noise reduction in processing speech signals.
More specifically, the invention relates to using adaptive filters to extract speech information from a speech signal containing noise.
Description of the Related Art
Automatic speech recognition (“ASR”) systems convert audio signals containing spoken words to text. The “front ends” of such systems initiate the conversion process by extracting critical identifying speech “features” from a targeted speech signal. The feature-extraction performance of ASR systems degrades significantly when the targeted speech signal is corrupted by noise. Indeed, noise hinders the widespread use of ASR systems in many otherwise practical applications. The same is true of any other communication or auditory system that takes the spoken word as input and processes that signal so that it is more clearly heard or understood, such as hearing aids, headphones, or radio, wire, or internet-based voice communications.
Current noise-reduction systems attempt to mitigate noise by modeling it and subtracting it from the signal. These systems require an accurate estimate of the noise signal. However, accurate estimation is extremely difficult because noise signals are non-stationary, and these techniques fail or lose effectiveness when the noise differs from the model or varies over time.
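By way of illustration only, the model-and-subtract approach described above can be sketched as a simple magnitude spectral subtraction. This sketch is not taken from any particular prior system. The function name `spectral_subtract`, the non-overlapping framing, the parameters `noise_frames`, `frame_len`, and `floor`, and the assumption that the leading frames contain noise only are all hypothetical choices made for the example.

```python
import numpy as np

def spectral_subtract(noisy, noise_frames=5, frame_len=256, floor=0.01):
    """Illustrative magnitude spectral subtraction on non-overlapping frames.

    Assumption (hypothetical): the first `noise_frames` frames hold noise only.
    """
    noisy = np.asarray(noisy, dtype=float)
    n_frames = len(noisy) // frame_len
    # Split the signal into frames; trailing samples that do not fill a frame are dropped.
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)

    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Noise model: per-bin average magnitude of the leading frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the modeled noise, clamping to a small spectral floor
    # so that no magnitude becomes negative.
    cleaned_mag = np.maximum(magnitude - noise_mag, floor * magnitude)

    # Rebuild each frame using the original (noisy) phase.
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return cleaned.reshape(-1)

# Example input: noise-only lead-in followed by a tone in noise (synthetic data).
rng = np.random.default_rng(0)
signal = rng.normal(0.0, 0.1, 256 * 20)
signal[256 * 5:] += np.sin(2 * np.pi * 440 * np.arange(256 * 15) / 8000)
cleaned = spectral_subtract(signal)
```

Because the noise model in this sketch is fixed once from the leading frames, it shares the limitation noted above: if the noise later changes in level or spectral shape, the subtracted estimate no longer matches the noise actually present.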
Other methods rely on training models that teach an ASR system to recognize noise-corrupted speech. However, environmental noise and system noise are often too large in magnitude or too dynamic to produce training models having the requisite reliability.
Finally, others have attempted to utilize the harmonic nature of speech to improve speech recognition. However, prior attempts to detect and track the harmonic structure of speech have been inadequate.