The present invention relates generally to communication apparatus. More particularly, it relates to techniques for suppressing noise in a speech signal, which may be used in a wireless or mobile communication device such as a cellular phone.
In many applications, a speech signal is received in the presence of noise, processed, and transmitted to a far-end party. One example of such a noisy environment is a wireless application. For many conventional cellular phones, a microphone is placed near a speaking user's mouth and used to pick up the speech signal. The microphone typically also picks up background noise, which degrades the quality of the speech signal transmitted to the far-end party.
Newer-generation wireless communication devices are designed with additional capabilities. In addition to voice communication, a user may be able to view text or browse World Wide Web pages via a display on the wireless device. Newer videophone services require the user to hold the phone at a distance, which in turn requires “far-field” speech pick-up. Moreover, “hands-free” communication is safer and more convenient, especially in an automobile. In any case, the microphone in the wireless device may be used in a “far-field” mode whereby it may be placed relatively far away from the speaking user (instead of being pressed against the user's ear and mouth). For far-field communication, less signal and more noise are received by the microphone, which results in a lower signal-to-noise ratio (SNR) and typically leads to poor signal quality.
One common technique for suppressing noise is the spectral subtraction technique. In a typical implementation of this technique, speech plus noise is received via a single microphone and transformed into a number of frequency bins via a fast Fourier transform (FFT). Under the assumption that the background noise is long-time stationary (in comparison with the speech), a model of the background noise is estimated during time periods of non-speech activity whereby the measured spectral energy of the received signal is attributed to noise. The background noise estimate for each frequency bin is utilized to estimate an SNR of the speech in the bin. Then, each frequency bin is attenuated according to its noise energy content with a respective gain factor computed based on that bin's SNR.
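The single-microphone flow described above may be illustrated with a minimal Python sketch. The function names, the frame length, the power-subtraction gain rule sqrt(SNR / (1 + SNR)), and the gain floor are assumptions chosen for illustration; the description above does not fix a particular gain rule, and a practical implementation would also need windowing, frame overlap, and a voice activity detector to select the non-speech frames.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(bins):
    """Inverse FFT computed from the forward FFT by conjugation."""
    n = len(bins)
    return [v.conjugate() / n for v in fft([b.conjugate() for b in bins])]


def estimate_noise_power(noise_frames):
    """Average the per-bin power over frames assumed to contain no speech."""
    power = [0.0] * len(noise_frames[0])
    for frame in noise_frames:
        for k, value in enumerate(fft(frame)):
            power[k] += abs(value) ** 2
    return [p / len(noise_frames) for p in power]


def suppress_frame(frame, noise_power, gain_floor=0.1):
    """Attenuate each frequency bin with a gain derived from its estimated SNR.

    The gain rule and gain_floor are illustrative assumptions, not values
    taken from the text.
    """
    cleaned = []
    for value, n_pow in zip(fft(frame), noise_power):
        total_pow = abs(value) ** 2
        if n_pow <= 0.0:
            gain = 1.0  # no noise estimated in this bin, pass it through
        else:
            # Estimated speech power is the measured power minus the noise estimate.
            snr = max(total_pow - n_pow, 0.0) / n_pow
            gain = max(math.sqrt(snr / (1.0 + snr)), gain_floor)
        cleaned.append(gain * value)
    # The gains are symmetric across bins for a real input, so the result is real.
    return [v.real for v in ifft(cleaned)]
```

In this sketch, bins whose measured power is close to the noise estimate are attenuated down to the floor, while bins dominated by speech pass nearly unchanged. The floor limits how far any bin is attenuated, which is one common way to reduce the artifacts that arise when the noise estimate is inaccurate.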
The spectral subtraction technique is generally effective at suppressing stationary noise components. However, due to the time-variant nature of the noisy environment (e.g., street, airport, restaurant, and so on), the noise models estimated in the conventional manner using a single microphone are likely to differ from the actual noise. This may result in an output speech signal having low audible quality, insufficient noise reduction, injected artifacts, or a combination thereof.
Another technique for suppressing noise uses a microphone array. For this technique, multiple microphones are typically arranged in a linear or some other type of array. An adaptive or non-adaptive method is then used to process the signals received from the microphones to suppress noise and improve speech SNR. However, the microphone array has not been applied to mobile communication devices since it generally requires a certain size and cannot fit into the small form factor of current mobile devices.
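As one illustration of a non-adaptive array method, a delay-and-sum beamformer may be sketched in Python as follows. The text above does not name a specific method, so the choice of delay-and-sum, the function name, and the assumption that the per-microphone arrival delays are known integer sample counts are all illustrative.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel on the speech source and average.

    channels: equal-length lists of samples, one list per microphone.
    delays:   non-negative integer number of samples by which the speech
              arrives late at each microphone (assumed known here).
    """
    if len(channels) != len(delays):
        raise ValueError("one delay is required per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    usable = len(channels[0]) - max(delays)
    output = []
    for t in range(usable):
        total = 0.0
        for channel, delay in zip(channels, delays):
            # Advancing each channel by its delay lines up the speech samples.
            total += channel[t + delay]
        output.append(total / len(channels))
    return output
```

After alignment the speech adds coherently across the channels, whereas noise that is uncorrelated between microphones partially cancels in the average, which improves the speech SNR. The achievable improvement depends on the number of microphones and their spacing, consistent with the size constraint noted above.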
Conventional wireless communication devices such as cellular phones typically utilize a single microphone to pick up the speech signal. The single-microphone design limits the type of signal processing that may be performed on the received signal, and may further limit the amount of improvement (i.e., the amount of noise suppression) that is achievable. The single-microphone design is also ineffective at suppressing noise in far-field applications where the microphone is placed at a distance (e.g., a few feet) away from the speech source.
As can be seen, techniques that can be used to suppress noise in a speech signal in a wireless environment are highly desirable.