Audio signal processors modify characteristics of audio signals. Typical applications of audio signal processors include filtering, speech compression, echo cancelling, noise suppression, and energy estimation. Audio signal processors are frequently used in communication systems for processing speech, where they advantageously provide improved audio quality and efficient transmission and storage of speech.
Acoustic noise suppression in a speech communication system generally serves the purpose of improving the overall quality of the desired audio signal by filtering environmental background noise from the desired speech signal. This speech enhancement process is particularly necessary in environments having abnormally high levels of ambient background noise, such as an aircraft, a moving vehicle, or a noisy factory.
One noise suppression technique is spectral subtraction, also referred to as spectral gain modification. Using this approach, the audio input signal is divided into individual spectral bands by a bank of bandpass filters, and particular spectral bands are attenuated according to their noise energy content. A spectral subtraction noise suppression prefilter utilizes an estimate of the background noise power spectral density to generate a signal-to-noise ratio (SNR) of the speech in each channel, which, in turn, is used to compute a gain factor for each individual channel. The gain factor is used as the attenuation for that particular spectral band. The channels are then attenuated and recombined to produce the noise-suppressed output waveform.
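The per-channel gain computation described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular system: the linear SNR-to-gain rule, the -15 dB floor, the 15 dB full-gain point, and all function names are hypothetical, and the channels are assumed to have already been separated by a filter bank.

```python
import math

def channel_snr_db(channel_energy, noise_energy, eps=1e-12):
    """Per-channel SNR in dB from channel energy and the background noise estimate."""
    return [10.0 * math.log10(max(e, eps) / max(n, eps))
            for e, n in zip(channel_energy, noise_energy)]

def gain_db_from_snr(snr_db, floor_db=-15.0, slope=1.0, full_gain_snr_db=15.0):
    """Hypothetical gain rule: 0 dB at or above full_gain_snr_db,
    falling linearly below it and clamped at floor_db."""
    gain = slope * (snr_db - full_gain_snr_db)
    return max(floor_db, min(0.0, gain))

def suppress(channel_samples, channel_energy, noise_energy):
    """Attenuate each channel by its gain factor and recombine by summation."""
    snrs = channel_snr_db(channel_energy, noise_energy)
    gains = [10.0 ** (gain_db_from_snr(s) / 20.0) for s in snrs]
    n = len(channel_samples[0])
    return [sum(g * ch[i] for g, ch in zip(gains, channel_samples))
            for i in range(n)]

# Two channels: the first is dominated by speech, the second by noise.
output = suppress([[1.0, 1.0], [1.0, 1.0]], [100.0, 1.0], [1.0, 1.0])
```

In this sketch the speech-dominated channel passes unattenuated, while the noise-dominated channel is reduced to the assumed floor before the channels are summed.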
In specialized applications involving relatively high background noise environments, most noise suppression techniques exhibit significant performance limitations. One example of such an application is the vehicle speakerphone option to a cellular mobile radio telephone system, which provides hands-free operation for the automobile driver. The mobile hands-free microphone is typically located at a greater distance from the user, such as being mounted overhead on the visor. The more distant microphone delivers a much poorer signal-to-noise ratio to the land-end party due to road and wind noise conditions. Although the received speech at the land-end is usually intelligible, continuous exposure to such background noise levels often increases listener fatigue.
In rapidly-changing high noise environments, a severe low frequency noise flutter develops in the output speech signal which resembles a distant "jet engine roar" sound. This noise flutter is inherent in a spectral subtraction noise suppression system, since the individual channel gain parameters are continuously being updated in response to the changing background noise environment.
The background noise flutter problem was indirectly addressed but not eliminated through the use of gain smoothing. Unfortunately, excessive gain smoothing still produces noticeable detrimental effects in voice quality, the primary effect being the apparent introduction of a tail-end echo or "noise pump" to spoken words. There is also significant reduction in voice amplitude with large amounts of gain smoothing.
The noise flutter performance was further improved by the technique of smoothing the noise suppression gain factors for each individual channel on a per-sample basis instead of on a per-frame basis. However, this technique did not appreciate that the primary source of the channel gain discontinuities is the inherent fluctuation of background noise in each channel from one frame to the next. In known spectral subtraction systems, even a 2 dB SNR variation would create a few dB of gain variation, which is then heard as an annoying background noise flutter.
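The difference between per-frame and per-sample gain smoothing may be sketched as follows. The one-pole recursion, the smoothing constant `alpha`, and the function names are assumptions chosen for illustration; the text does not specify the smoothing filter actually used.

```python
def per_frame_gains(frame_gains, frame_len):
    """Hold each frame's gain constant over the frame (steps at frame edges)."""
    return [g for g in frame_gains for _ in range(frame_len)]

def per_sample_gains(frame_gains, frame_len, alpha=0.9, initial=1.0):
    """Assumed one-pole smoothing applied at every sample:
    g[n] = alpha * g[n-1] + (1 - alpha) * target."""
    out, g = [], initial
    for target in frame_gains:
        for _ in range(frame_len):
            g = alpha * g + (1.0 - alpha) * target
            out.append(g)
    return out

def max_step(gains):
    """Largest sample-to-sample change in the applied gain."""
    return max(abs(b - a) for a, b in zip(gains, gains[1:]))

# A channel gain that changes from frame to frame (linear gain values).
targets = [1.0, 0.5, 1.0]
held = per_frame_gains(targets, frame_len=10)
smoothed = per_sample_gains(targets, frame_len=10)
```

Per-sample smoothing reduces the size of each gain discontinuity, but it still tracks target gains that fluctuate from frame to frame with the background noise, which is the limitation noted above.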
U.S. Pat. No. 4,811,404 addressed the noise flutter problem with an improved noise suppression system which performs speech quality enhancement upon the speech-plus-noise signal available at the input to generate a clean speech signal at the output by spectral gain modification. The improvements included the addition of a signal-to-noise ratio (SNR) threshold mechanism to reduce background noise flutter by offsetting the gain rise of the gain tables until a certain SNR threshold is reached; the use of a voice metric calculator to produce more accurate background noise estimates by basing the update decision on the overall voice-like characteristics in the channels and the time interval since the last update; and the use of a channel SNR modifier to provide immunity to narrow band noise bursts through modification of the SNR estimates based on the voice metric calculation and the channel energies.
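The effect of an SNR threshold on noise flutter may be illustrated with a simplified gain table. This is a sketch under assumptions and does not reproduce the gain tables of U.S. Pat. No. 4,811,404: the slope, the floor, the threshold values, and the function names are hypothetical.

```python
def gain_db(snr_db, threshold_db, slope=1.5, floor_db=-15.0):
    """Hypothetical gain table: the gain stays at floor_db until the SNR
    exceeds threshold_db, then rises with the given slope up to 0 dB."""
    rise = slope * max(0.0, snr_db - threshold_db)
    return min(0.0, floor_db + rise)

def flutter_db(snr_a, snr_b, threshold_db):
    """Gain variation produced when the channel SNR moves between two values."""
    return abs(gain_db(snr_a, threshold_db) - gain_db(snr_b, threshold_db))

# Background noise whose channel SNR estimate fluctuates between 1 dB and 3 dB.
without_threshold = flutter_db(1.0, 3.0, threshold_db=0.0)
with_threshold = flutter_db(1.0, 3.0, threshold_db=4.0)
```

With the assumed slope, a 2 dB SNR variation below the threshold leaves the gain at the floor, whereas the same variation above the start of the gain rise is amplified into a larger gain variation.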
However, the solution proposed by U.S. Pat. No. 4,811,404 introduces a trade-off between voice quality and noise suppression. When the gain rise has a relatively low offset, the noise suppression is improved but the voice quality is degraded. When the gain rise has a relatively high offset, the noise suppression is degraded but the voice quality is improved.
Moreover, energy estimators estimate the energy in a signal. The energy estimate of the signal is typically used in audio signal processors. Typical applications of the energy estimate in signal processors include signal detection, noise estimation, and signal-to-noise ratio estimation.
A common problem in estimating the energy of a noise signal is that the variance of the energy estimate from one estimate to another is large.
One solution to the problem involves filtering the energy estimate. However, filters that achieve acceptable variance of the energy estimate when estimating noise are generally complicated in that they perform computationally intensive steps.
Another solution to the problem involves estimating the energy of the signal over long periods of time. However, the response time of this technique is slow.
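The variance problem and the two solutions described above may be illustrated with synthetic noise. This is a sketch under assumptions: the white Gaussian noise model, the frame lengths, the function names, and the simple one-pole smoother (used for brevity, not one of the more complicated filters referred to above) are all hypothetical.

```python
import random
import statistics

def frame_energies(samples, frame_len):
    """Mean-square energy of consecutive non-overlapping frames."""
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def one_pole_smooth(values, alpha):
    """Recursive smoothing y[k] = alpha * y[k-1] + (1 - alpha) * x[k],
    seeded with the first value."""
    out, y = [], values[0]
    for x in values:
        y = alpha * y + (1.0 - alpha) * x
        out.append(y)
    return out

rng = random.Random(0)  # fixed seed so the sketch is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(40000)]

short = frame_energies(noise, 20)        # fast response, high variance
long_ = frame_energies(noise, 400)       # low variance, one value per 400 samples
filtered = one_pole_smooth(short, 0.9)   # filtered short-frame estimate

var_short = statistics.pvariance(short)
var_long = statistics.pvariance(long_)
var_filtered = statistics.pvariance(filtered)
```

Both the long-window estimate and the filtered estimate vary less from one estimate to the next than the short-frame estimate, at the cost of a slower response to changes in the noise level.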
Accordingly, there is a need for an improved noise suppression system and method therefor which addresses the problem of the trade-off between voice quality and noise suppression of the prior art.