Field of the Technology
The present technology relates generally to audio processing, and more particularly to adaptive noise reduction of an audio signal.
Description of Related Art
Currently, there are many methods for reducing background noise within an acoustic signal in an adverse audio environment. One such method is to use a stationary noise suppression system. The stationary noise suppression system will always provide an output noise that is a fixed amount lower than the input noise. Typically, the noise suppression is in the range of 12-13 decibels (dB). The noise suppression is fixed to this conservative level in order to avoid producing speech loss distortion, which will be apparent with higher noise suppression.
In order to provide higher noise suppression, dynamic noise suppression systems based on signal-to-noise ratios (SNR) have been utilized. This SNR may then be used to determine a suppression value. Unfortunately, SNR, by itself, is not a very good predictor of speech distortion due to existence of different noise types in the audio environment. SNR is a ratio of how much louder speech is than noise. However, speech may be a non-stationary signal which may constantly change and contain pauses. Typically, speech energy, over a period of time, will include a word, a pause, a word, a pause, and so forth. Additionally, stationary and dynamic noises may be present in the audio environment. The SNR averages all of these stationary and non-stationary speech and noise and determines a ratio based on what the overall level of noise is. There is no consideration as to the statistics of the noise signal.
In some prior art systems, an enhancement filter may be derived based on an estimate of a noise spectrum. One common enhancement filter is the Wiener filter. Disadvantageously, the enhancement filter is typically configured to minimize certain mathematical error quantities, without taking into account a user's perception. As a result, a certain amount of speech degradation is introduced as a side effect of the signal enhancement which suppress noise. For example, speech components that are lower in energy than the noise typically end up being suppressed by the enhancement filter, which results in a modification of the output speech spectrum that is perceived as speech distortion. This speech degradation will become more severe as the noise level rises and more speech components are attenuated by the enhancement filter. That is, as the SNR gets lower, typically more speech components are buried in noise or interpreted as noise, and thus there is more resulting speech loss distortion. This introduces more speech loss distortion and speech degradation.
Therefore, it is desirable to be able to provide adaptive noise reduction that balances the tradeoff between speech loss distortion and residual noise.