Noise suppression techniques in communication systems are well known. The goal of a noise suppression system is to reduce the amount of background noise during speech coding so that the overall quality of the coded speech signal of the user is improved. Communication systems which implement speech coding include, but are not limited to, voice mail systems, cellular radiotelephone systems, trunked communication systems, and airline communication systems.
One noise suppression technique which has been implemented in cellular radiotelephone systems is spectral subtraction. In this approach, the audio input is divided into individual spectral bands (channels) by a suitable spectral divider, and the individual spectral channels are then attenuated according to the noise energy content of each channel. The spectral subtraction approach utilizes an estimate of the background noise power spectral density to generate a signal-to-noise ratio (SNR) of the speech in each channel, which in turn is used to compute a gain factor for each individual channel. The gain factor is then used as an input to modify the channel gain for each of the individual spectral channels. The channels are then recombined to produce the noise-suppressed output waveform. An example of the spectral subtraction approach implemented in an analog cellular radiotelephone system is found in U.S. Pat. No. 4,811,404 to Vilmur, assigned to the assignee of the present application.
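The per-channel gain computation described above can be illustrated with a minimal sketch. It is not the gain table or mapping used in Vilmur: the function names, the attenuation floor MIN_GAIN_DB, the knee SNR_FULL_GAIN_DB, and the linear-in-dB mapping from SNR to gain are all assumptions chosen only to show how an SNR estimate per channel might be turned into a gain factor per channel.

```python
import math

MIN_GAIN_DB = -13.0       # assumed attenuation floor; not taken from the source
SNR_FULL_GAIN_DB = 15.0   # assumed SNR at or above which a channel passes unattenuated


def channel_snr_db(channel_energy, noise_energy):
    """Estimate the SNR of one channel, in dB, from its energy and the noise estimate."""
    eps = 1e-12  # guards against division by zero and log of zero
    return 10.0 * math.log10(max(channel_energy, eps) / max(noise_energy, eps))


def channel_gain(snr_db):
    """Map an SNR in dB to a linear amplitude gain (hypothetical linear-in-dB ramp)."""
    frac = min(max(snr_db / SNR_FULL_GAIN_DB, 0.0), 1.0)
    gain_db = MIN_GAIN_DB * (1.0 - frac)
    return 10.0 ** (gain_db / 20.0)


def channel_gains(channel_energies, noise_estimate):
    """Compute one gain factor per spectral channel."""
    return [
        channel_gain(channel_snr_db(energy, noise))
        for energy, noise in zip(channel_energies, noise_estimate)
    ]


# Channel 0 is well above the noise estimate; channel 1 is at the noise level.
gains = channel_gains([100.0, 1.0], [1.0, 1.0])
```

Under these assumptions, a channel whose energy is well above the noise estimate keeps unity gain, while a channel at the noise level is attenuated to the floor. Each channel would then be scaled by its gain before the channels are recombined.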
As stated in the aforementioned U.S. Patent, the prior art techniques of noise suppression suffer when a sudden, strong increase in background noise level occurs. To overcome the deficiencies in the prior art, the aforementioned U.S. Patent to Vilmur performs a forced update of the noise estimate regardless of the voice metric sum if M frames elapse without a background noise estimate update, where M is recommended in Vilmur to be between 50 and 300. Because a frame in Vilmur is 10 milliseconds (ms), and assuming M is 100, an update would occur at least once every second regardless of the voice metric sum, VMSUM (i.e., whether an update is needed or not).
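The forced-update timing described above can be sketched as a simple frame counter. This is an illustration under stated assumptions, not the logic of Vilmur: the class and function names are hypothetical, VMSUM_THRESHOLD is an arbitrary placeholder value, and the rule that a normal update occurs when VMSUM is at or below the threshold is assumed for the purpose of the example. Only the 10 ms frame length and M = 100 come from the text.

```python
FRAME_MS = 10          # frame duration stated in the text
M_FRAMES = 100         # example value of M used in the text
VMSUM_THRESHOLD = 50   # hypothetical threshold; not given in the text


class NoiseUpdateDecision:
    """Tracks frames since the last noise-estimate update (illustrative only)."""

    def __init__(self, m_frames=M_FRAMES):
        self.m_frames = m_frames
        self.frames_since_update = 0

    def step(self, vmsum):
        """Return 'normal', 'forced', or None for one frame."""
        self.frames_since_update += 1
        if vmsum <= VMSUM_THRESHOLD:
            decision = "normal"   # assumed rule: low voice metric sum allows an update
        elif self.frames_since_update >= self.m_frames:
            decision = "forced"   # M frames elapsed without any update
        else:
            decision = None
        if decision is not None:
            self.frames_since_update = 0
        return decision


def count_forced(vmsum_per_frame):
    """Count forced updates over a sequence of per-frame VMSUM values."""
    decider = NoiseUpdateDecision()
    return sum(1 for vmsum in vmsum_per_frame if decider.step(vmsum) == "forced")


# Three seconds of input whose VMSUM never drops below the threshold.
forced_updates = count_forced([200] * 300)
forced_interval_ms = M_FRAMES * FRAME_MS
```

With M = 100 and 10 ms frames, the forced-update interval is 1000 ms, so an input that never produces a low voice metric sum triggers one forced update per second.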
Forcing an update of the noise estimate regardless of the voice metric can attenuate the user's speech signal even though no additional background noise has been added. This in turn results in a degradation in audio quality as perceived by the end user. Furthermore, input signals other than a user's speech signal (for example, "music-on-hold") can cause problems in that the forced update of the noise estimate can occur over continuous intervals. This is because music can span several seconds (or minutes) without sufficient pauses that would allow a normal update of the background noise estimate. The prior art would, therefore, allow a forced update every M frames because it has no mechanism to differentiate background noise from non-stationary input signals. This invalid forced update not only attenuates the input signal, but also causes severe distortion since the spectral estimate is being updated based on a time-varying, non-stationary input.
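The effect of repeated forced updates on a sustained, non-noise input can be illustrated numerically. The sketch below is an assumption-laden example rather than a description of any particular prior art system: the exponential smoothing rule, the smoothing constant alpha, the function names, and the energy values are all hypothetical. It shows only that, if the noise estimate is repeatedly updated from frames that actually contain music, the estimate converges toward the music energy and the apparent SNR collapses.

```python
import math


def snr_db(signal_energy, noise_estimate):
    """Apparent SNR, in dB, of a channel given the current noise estimate."""
    return 10.0 * math.log10(signal_energy / noise_estimate)


def forced_update(noise_estimate, frame_energy, alpha=0.9):
    """Smooth the noise estimate toward the frame energy (assumed update rule)."""
    return alpha * noise_estimate + (1.0 - alpha) * frame_energy


music_energy = 100.0    # hypothetical channel energy of sustained music
noise_estimate = 1.0    # hypothetical true background noise energy

snr_before = snr_db(music_energy, noise_estimate)

# Apply 50 forced updates while the input is music rather than noise.
for _ in range(50):
    noise_estimate = forced_update(noise_estimate, music_energy)

snr_after = snr_db(music_energy, noise_estimate)
```

In this example the apparent SNR starts near 20 dB and falls to nearly 0 dB, so a gain rule driven by SNR would treat the music as noise and attenuate it.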
Thus, a need exists for a more accurate and reliable noise suppression system for use in communication systems.