The present invention pertains to techniques for reducing noise in a signal. More particularly, the present invention relates to techniques for reducing noise in a signal representing speech.
In a speech recognition system, the presence of noise in an input speech signal can degrade recognition accuracy. Noise can originate from many different sources and may be introduced through either acoustic coupling or electrical coupling. Acoustic coupling of noise into a speech signal might occur when a speaker is located in a noisy setting, such as a busy office environment. Electrical coupling of noise may result from electromagnetic radiation emitted by electrical devices in proximity to components of the speech recognition system.
Various techniques are known for reducing noise in a speech signal. However, these techniques generally are not adaptable, or are not sufficiently adaptable, to the amount of noise in the speech signal at any given time. A typical consequence of this shortcoming is that a given noise reduction technique may perform adequately in a noisy environment but perform poorly in a low-noise environment. Such techniques, therefore, tend not to be flexible enough to handle signals under a variety of conditions. In addition, prior art noise reduction techniques may not be capable of operating upon individual frequency components of a signal. Furthermore, many noise reduction techniques used in speech recognition handle the beginning of a sentence poorly, because few samples have been observed at that point and the signal-to-noise ratio is therefore difficult to estimate accurately.
The present invention includes a method and apparatus for reducing noise in data representing an audio signal, such as a speech signal. For each of multiple frequency components of the audio signal, a spectral magnitude of the data and an estimate of noise in the data are computed. The estimate of noise is scaled by a noise scale factor that is a function of the corresponding frequency component, to produce a scaled noise estimate. The scaled noise estimate is subtracted from the spectral magnitude to produce cleaned audio data. The scale factor may also be a function of an absolute noise level for each frequency component, a signal-to-noise ratio for each frequency component, or both.
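By way of illustration only, the per-component scaling and subtraction summarized above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names `noise_scale_factor` and `reduce_noise`, the particular shape of the scale factor (a linear frequency weighting combined with a clamped signal-to-noise weighting), the decibel formula, and the flooring of the result at a non-negative value are all hypothetical choices made for the example.

```python
import math


def noise_scale_factor(bin_index, num_bins, snr_db):
    """Hypothetical noise scale factor for one frequency component.

    The source only says the factor is a function of the frequency
    component and, optionally, of noise level and/or SNR. The shape
    used here is an assumption chosen for illustration.
    """
    # Assumed frequency weighting: 1.0 at the lowest bin, 1.5 at the highest.
    freq_weight = 1.0 + 0.5 * bin_index / max(num_bins - 1, 1)
    # Assumed SNR weighting: subtract less noise as the SNR rises,
    # with the SNR clamped to the range 0..20 dB.
    snr_clamped = min(max(snr_db, 0.0), 20.0)
    snr_weight = 1.0 - 0.5 * snr_clamped / 20.0
    return freq_weight * snr_weight


def reduce_noise(magnitudes, noise_estimates, floor=0.0):
    """Subtract a scaled per-bin noise estimate from per-bin spectral magnitudes."""
    if len(magnitudes) != len(noise_estimates):
        raise ValueError("magnitudes and noise_estimates must have equal length")
    num_bins = len(magnitudes)
    cleaned = []
    for k, (mag, noise) in enumerate(zip(magnitudes, noise_estimates)):
        # Per-bin SNR in dB (assumed definition); the small epsilon
        # guards against division by zero and log of zero.
        snr_db = 20.0 * math.log10(max(mag, 1e-12) / max(noise, 1e-12))
        scaled_noise = noise_scale_factor(k, num_bins, snr_db) * noise
        # Flooring is an added assumption: a magnitude cannot be negative.
        cleaned.append(max(mag - scaled_noise, floor))
    return cleaned


if __name__ == "__main__":
    # Three frequency bins with a flat noise estimate (made-up numbers).
    print(reduce_noise([10.0, 1.0, 5.0], [1.0, 1.0, 1.0]))
```

In this sketch, a component whose magnitude is well above its noise estimate has only a fraction of the estimate removed, while a component at or below the noise level is reduced to the floor. That is one possible way of behaving reasonably on both clean and noisy input, assuming the weightings chosen above.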
The noise reduction technique operates well on both noisy input signals and clean input signals. The technique can be implemented as a real-time method of reducing noise in a speech signal, which begins the noise reduction process immediately when the speech signal becomes available, thereby reducing undesirable delay between production of the speech signal and its recognition by a speech recognizer.
Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.