The present invention relates generally to signal processing. More particularly, it relates to techniques for suppressing noise in a speech signal, which may be used, for example, in an automobile.
In many applications, a speech signal is received in the presence of noise, processed, and transmitted to a far-end party. One example of such a noisy environment is the passenger compartment of an automobile. A microphone may be used to provide hands-free operation for the automobile driver. The hands-free microphone is typically located farther from the speaking user than the microphone of a regular hand-held phone (e.g., the hands-free microphone may be mounted on the dashboard or on the overhead visor). The distant microphone then picks up both speech and background noise, which may include vibration noise from the engine and/or road, wind noise, and so on. The background noise degrades the quality of the speech signal transmitted to the far-end party, and also degrades the performance of an automatic speech recognition device.
One common technique for suppressing noise is the spectral subtraction technique. In a typical implementation of this technique, speech plus noise is received via a single microphone and transformed into a number of frequency bins via a fast Fourier transform (FFT). Under the assumption that the background noise is long-time stationary (in comparison with the speech), a model of the background noise is estimated during time periods of non-speech activity whereby the measured spectral energy of the received signal is attributed to noise. The background noise estimate for each frequency bin is utilized to estimate a signal-to-noise ratio (SNR) of the speech in the bin. Then, each frequency bin is attenuated according to its noise energy content via a respective gain factor computed based on that bin's SNR.
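By way of illustration only, the conventional single-microphone processing described above may be sketched for one frame as follows. The sketch is not taken from any particular implementation and rests on several assumptions: a naive discrete Fourier transform stands in for the FFT, the speech/non-speech decision is supplied externally as the hypothetical flag `is_speech`, the noise model is a recursively smoothed per-bin power estimate with a hypothetical smoothing constant `alpha`, and the gain rule `1 - 1/SNR` with a lower bound `gain_floor` is one common choice among many. The function names `dft`, `idft`, `bin_gain`, and `suppress_frame` are likewise hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (a stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def bin_gain(power, noise, gain_floor):
    """Gain for one bin from its SNR (assumed rule: 1 - 1/SNR, floored)."""
    if noise <= 0.0:
        return 1.0          # no noise estimated in this bin: pass it through
    if power <= 0.0:
        return gain_floor   # empty bin: apply the maximum attenuation
    snr = power / noise     # ratio of measured power to estimated noise power
    return max(1.0 - 1.0 / snr, gain_floor)


def suppress_frame(frame, noise_power, is_speech, alpha=0.9, gain_floor=0.1):
    """Process one frame; returns (output_frame, updated_noise_power)."""
    spectrum = dft(frame)
    power = [abs(c) ** 2 for c in spectrum]
    if not is_speech:
        # Non-speech period: the measured energy is attributed to noise and
        # folded into the running per-bin noise estimate.
        noise_power = [alpha * n + (1.0 - alpha) * p
                       for n, p in zip(noise_power, power)]
    # Attenuate each bin by a gain derived from that bin's SNR.
    gains = [bin_gain(p, n, gain_floor) for p, n in zip(power, noise_power)]
    output = idft([g * c for g, c in zip(gains, spectrum)])
    return output, noise_power
```

In such a sketch, frames flagged as non-speech update the noise estimate, while frames flagged as speech are attenuated using the most recent estimate without modifying it. A practical implementation would additionally window and overlap successive frames, which is omitted here for brevity.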
The spectral subtraction technique is generally effective at suppressing stationary noise components. However, due to the time-variant nature of the noisy environment, the noise models estimated in the conventional manner using a single microphone are likely to differ from the actual noise. This may result in an output speech signal that exhibits one or more of low audible quality, insufficient noise reduction, and injected artifacts.
As can be seen, techniques that can suppress noise in a speech signal, and which may be used in a noisy environment, particularly in an automobile, are highly desirable.