Cellular telephones, speaker phones, and various other communication devices utilize background noise suppression to enhance the quality of a received signal. In particular, the presence of acoustic background noise can substantially degrade the performance of a speech communication system. The problem is exacerbated when a narrow-band speech coder is used in the communication link since such coders are tuned to specific characteristics of clean speech signals and handle noisy speech and background noise rather poorly.
A simplified block diagram of the basic noise suppression system 100 is shown in FIG. 1. Such a system is typically utilized to attenuate the input speech/noise signal when signal-to-noise ratio (SNR) values are low. As shown, system 100 includes fast Fourier transformer (FFT) 101, inverse FFT 102, total channel energy estimator 103, noise energy estimator 105, SNR estimator 106, and channel gain generator 104. During operation, the input signal (comprising speech plus noise) is transformed into the frequency domain by FFT 101 and grouped into channels that are similar to critical bands of hearing. The channel signal energies are computed via estimator 103, and the background noise channel energies are conditionally updated via estimator 105 as a function of the spectral distance between the signal energy and noise energy estimates. From these energy estimates, estimator 106 computes the channel SNR vector, which is then used to determine the individual channel gains. The channel gains are then applied via a mixer to the original complex spectrum of the input signal, and the result is inverse transformed, using the overlap-and-add method, to produce the noise-suppressed output signal. As discussed above, when SNR values are estimated to be low, attenuation of the FFT signal takes place.
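The per-channel steps described above (channel energy, channel SNR, and application of channel gains to the complex spectrum) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the band layout passed in `bands`, and the `floor` value are hypothetical, and the FFT, the conditional noise update, and the overlap-and-add stage are omitted.

```python
import math

def channel_energies(spectrum, bands):
    """Sum |X[k]|^2 over the FFT bins of each channel.

    `bands` is a hypothetical list of (lo, hi) inclusive bin ranges.
    """
    return [sum(abs(spectrum[k]) ** 2 for k in range(lo, hi + 1))
            for lo, hi in bands]

def channel_snr_db(signal_energy, noise_energy, floor=1e-12):
    """Per-channel SNR in dB; `floor` (an assumption) avoids log of zero."""
    return [10.0 * math.log10(max(e, floor) / max(n, floor))
            for e, n in zip(signal_energy, noise_energy)]

def apply_channel_gains(spectrum, bands, gains_db):
    """Scale every bin of a channel by that channel's gain (given in dB)."""
    out = list(spectrum)  # bins outside every band are left unchanged
    for (lo, hi), g_db in zip(bands, gains_db):
        g = 10.0 ** (g_db / 20.0)  # dB to linear amplitude factor
        for k in range(lo, hi + 1):
            out[k] = spectrum[k] * g
    return out
```

In a full system, the output of `apply_channel_gains` would be passed to the inverse FFT and combined with neighboring frames by overlap-and-add.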
FIG. 2 shows the basic gain as a function of SNR for prior-art systems. From FIG. 2 it can be seen that for low channel SNR (i.e., less than an SNR threshold), the signal is presumed to be noise, and the gain for that channel is set to the minimum (in this case, −13 dB). As the SNR increases past the SNR threshold, the gain function enters a transition region, where the gain follows a constant slope of approximately 1, meaning that for every dB increase in SNR, the gain is increased by 1 dB. As the SNR increases further (generally indicating speech), the gain is clamped at 0 dB so as not to increase the power of the input signal. This gain function applies independently to each channel of the communication system, so the gain can be 0 dB in one channel while it is −13 dB in another.
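The gain curve of FIG. 2 can be written as a small piecewise function. The following is a minimal sketch; the function name and the default `snr_threshold_db` value are assumptions made for illustration, since no threshold value is stated here. Only the −13 dB minimum, the slope of approximately 1, and the 0 dB clamp come from the description.

```python
def basic_channel_gain_db(snr_db, snr_threshold_db=2.0,
                          min_gain_db=-13.0, slope=1.0):
    """Piecewise gain (dB) versus channel SNR (dB).

    snr_threshold_db is a hypothetical value chosen for illustration.
    """
    if snr_db <= snr_threshold_db:
        # Below the threshold the channel is presumed to be noise.
        return min_gain_db
    # Transition region: gain rises `slope` dB per dB of SNR.
    gain = min_gain_db + slope * (snr_db - snr_threshold_db)
    # Never amplify: clamp at 0 dB.
    return min(gain, 0.0)
```

With these assumed defaults, the transition region spans 13 dB of SNR, after which the gain stays at 0 dB.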
Although the above technique does reduce the background noise, it was observed that background noise could produce annoying artifacts when entering the transition region of the gain curve. Background noise has short-term SNR fluctuations around the 0 dB origin because the channel noise energy estimator smooths the energy via low-pass filtering. As a result, the channel energy estimate moves more quickly than the respective noise energy estimate, and the short-term fluctuations in SNR (and subsequently, gain) cause “waterfall” or “swirling” artifacts. To circumvent this problem, prior-art techniques have proposed a method by which the channel SNR estimate is modified to include a process that 1) detects spurious activity in the transition region, and 2) sets the channel SNR back to zero when the signal is spurious. This method is illustrated in FIG. 3.
A problem with this method is that, in order to detect that a channel SNR is “spurious”, it is required that only “some” of the channel SNRs enter the transition region. This works for stationary noises that have uncorrelated frequency components (e.g., wind noise in a car), but in cases where the frequency components are correlated (e.g., office noise, interfering talkers, impulsive noise, etc.), the method cannot discriminate between non-stationary background noise and speech.
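The limitation can be illustrated with a rough sketch of such a spurious-activity check. The exact detection rule is not given here, so the count-based test below, the function name, and both default parameter values are assumptions made purely for illustration.

```python
def suppress_spurious_channels(snr_db, snr_threshold_db=2.0,
                               max_active_channels=3):
    """Reset channel SNRs to zero when only a few channels are active.

    Assumed rule (not from the source): activity is treated as spurious
    when at most `max_active_channels` channels exceed the threshold.
    """
    active = [i for i, s in enumerate(snr_db) if s > snr_threshold_db]
    out = list(snr_db)
    if 0 < len(active) <= max_active_channels:
        for i in active:
            out[i] = 0.0  # set the spurious channel SNR back to zero
    return out
```

Under this assumed rule, a noise burst with correlated frequency components raises many channels at once, exceeds the count limit, and passes through unmodified, exactly as speech would.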
More recent efforts to improve noise suppression performance have focused on a “variable attenuation” concept. To alleviate the unpleasant artifacts that noise suppression can produce, the algorithm was modified to adaptively reduce the amount of noise reduction during severe SNR conditions. FIG. 4 shows the modified channel gain function, and how the gain changes relative to the instantaneous SNR for each channel. For this method, the overall long-term peak SNR dictates the minimum amount of gain applied to the noise component of the signal. A constant SNR threshold is used, and the gain slope is varied so that the curve intersects the 0 dB gain axis at the same channel SNR. The minimum gain is also constrained to vary only between −9 and −13 dB.
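A minimal sketch of this variable-attenuation gain curve follows. The function name and the default values of `snr_threshold_db` and `zero_gain_snr_db` are assumptions for illustration. The mapping from long-term peak SNR to minimum gain is not specified here, so the minimum gain is simply taken as an input and limited to the −9 to −13 dB range.

```python
def variable_attenuation_gain_db(snr_db, min_gain_db,
                                 snr_threshold_db=2.0,
                                 zero_gain_snr_db=15.0):
    """Gain (dB) versus channel SNR (dB) with a variable minimum gain.

    snr_threshold_db and zero_gain_snr_db are hypothetical values.
    """
    # Keep the minimum gain between -13 dB and -9 dB.
    min_gain_db = max(-13.0, min(-9.0, min_gain_db))
    if snr_db <= snr_threshold_db:
        return min_gain_db
    # Choose the slope so the curve always reaches 0 dB at the same SNR,
    # whatever the minimum gain is.
    slope = -min_gain_db / (zero_gain_snr_db - snr_threshold_db)
    return min(min_gain_db + slope * (snr_db - snr_threshold_db), 0.0)
```

With these assumed defaults, a minimum gain of −13 dB gives a slope of 1, while a minimum gain of −9 dB gives a shallower slope of 9/13, and both curves meet the 0 dB axis at the same channel SNR.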
While this method has proven to be effective in low SNR environments, it does not address the ongoing problem of non-stationary, impulsive-type noises. Thus, a need exists to improve the performance of prior-art noise suppression systems for non-stationary noises, while maximizing the benefits associated with the variable attenuation concept.