The spectral components of an information signal are used in a number of signal processing systems including channel vocoders for communication of speech, speech recognition systems and signal enhancement filters. Since the inputs to these systems are often contaminated by noise, there has been a great deal of interest in noise reduction techniques and, consequently, in noise estimation techniques. The effect of uncorrelated noise is to add a random component to the power in each frequency band, so accurate assessment of the noise content is crucial to achieving the desired end result, which is the elimination of noise from the complex signal.
Noise-free spectral components are required for optimum operation of channel vocoders. In a vocoder the input signal is filtered into a number of different frequency bands and the signal from each band is rectified (squared) and smoothed (low pass filtered). The smoothing process tends to reduce the variance of the noise. Such methods are disclosed in U.S. Pat. No. 3,431,355 to Rothauser et al and U.S. Pat. No. 3,431,355 to Schroeder. An alternative approach is disclosed in U.S. Pat. No. 3,855,423 to Brendzel et al. In this approach the level of the noise in each band is estimated from successive minima of the energy in that band and the level of the signal is estimated from successive maxima. In U.S. Pat. No. 4,000,369 to Paul et al, the noise levels are estimated in a similar fashion and subtracted from the input signals to obtain a better estimate of the speech signal in each band. This method reduces the mean value of the noise.
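The minima-based noise estimate and the subtraction step described above can be sketched as follows. This is a minimal illustration only, not the procedure of any of the cited patents: the function names, the sliding-window length and the sample energies are assumptions chosen for the example.

```python
def estimate_noise_floor(band_energy, window=8):
    """Track the noise level in one band as the minimum of the smoothed
    band energy over the most recent `window` frames (assumed length)."""
    floor = []
    for i in range(len(band_energy)):
        start = max(0, i - window + 1)
        floor.append(min(band_energy[start:i + 1]))
    return floor


def subtract_noise(band_energy, noise_floor):
    """Subtract the estimated noise level from each frame, clamping at zero
    so that the speech energy estimate never goes negative."""
    return [max(e - n, 0.0) for e, n in zip(band_energy, noise_floor)]


# Hypothetical smoothed energies for one band: low values are noise only,
# the two large values represent a burst of speech.
energy = [1.0, 1.2, 0.9, 5.0, 6.0, 1.1, 1.0]
floor = estimate_noise_floor(energy, window=4)
speech = subtract_noise(energy, floor)
```

The estimated floor stays near the noise level during the speech burst because the window still contains earlier noise-only frames; a window shorter than the burst would let the floor rise with the speech.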
Another application of spectral processing is for speech filtering. Weiss et al., in "Processing Speech Signals to Attenuate Interference", presented at the IEEE Symp. Speech Recognition, April 1974, disclose a spectral shaping technique. This technique uses frequency domain processing and describes two approaches--amplitude modulation (which is equivalent to gain control) and amplitude clipping (which is equivalent to a technique called spectral subtraction). Neither the noise estimate nor the speech estimate is updated so this filter is not adaptive. An output time waveform is obtained by recombining the spectral estimates with the original phases.
An adaptive speech filter is disclosed in U.S. Pat. No. 4,185,168 to Graupe and Causey, which is included by reference herein. Graupe and Causey describe a method for the adaptive filtering of a noisy speech signal based on the assumption that the noise has relatively stationary statistics compared to the speech signal.
In Graupe and Causey's method the input signal is divided into a set of signals limited to different frequency bands. The signal to noise ratio for each signal is then estimated in accordance with the time-wise variations of its absolute value. The gain of each signal is then controlled according to an estimate of the signal to noise ratio (the gain typically being close to unity for high signal to noise ratio and less than unity for low signal to noise ratio).
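The qualitative behavior of such a gain rule can be illustrated with a short sketch. The Wiener-style formula, the gain floor and the function name below are assumptions made for illustration; they are not the specific gain functions described by Graupe and Causey.

```python
def band_gain(band_power, noise_power, min_gain=0.1):
    """Return a gain for one frequency band from its estimated SNR.

    Assumed rule: gain = snr / (1 + snr), which approaches unity at high
    signal to noise ratio and falls toward `min_gain` at low ratio.
    """
    if noise_power <= 0.0:
        return 1.0  # no noise estimated: pass the band unchanged
    # Estimated signal power is the band power in excess of the noise power.
    snr = max(band_power - noise_power, 0.0) / noise_power
    return max(snr / (1.0 + snr), min_gain)


high_snr_gain = band_gain(band_power=101.0, noise_power=1.0)  # close to 1
low_snr_gain = band_gain(band_power=1.05, noise_power=1.0)    # held at the floor
```

The floor on the gain is a common practical choice to avoid fully muting a band; whether it is appropriate depends on the application.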
Graupe and Causey describe a particular method for estimating the noise power from successive minima in the signals, and describe several methods for determining the gain as a function of the estimated noise and signal powers. This is an alternative to the method described earlier in U.S. Pat. No. 4,025,721 to Graupe and Causey, which detects the pauses between utterances in the input speech signal and updates estimates of the noise parameters during these pauses. In U.S. Pat. No. 4,025,721, Graupe and Causey describe the use of Wiener and Kalman filters to reduce the noise. These filters can be implemented in the time domain or the frequency domain.
Boll, in "Suppression of Acoustic Noise in Speech using Spectral Subtraction", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-27, No. 2, April 1979, describes a computationally more efficient way of doing spectral subtraction. In the spectral subtraction technique, used by Paul, Weiss and Boll, a constant or slowly varying estimate of the noise spectrum is subtracted. However, successive measurements of the noise power in each frequency bin vary rapidly and only the mean level of the noise is reduced by spectral subtraction. The residual noise will depend upon the variance of the noise power. This is true also of Weiss's spectral shaping technique where the spectral gains are constant. In Graupe's method the gain applied to each bin is continuously varied so that both the variance and the mean level of the noise can be reduced.
There are many schemes for determining the spectral gains. One scheme is described by Ephraim and Malah in "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator", IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-32, No. 6, December 1984. This describes a technique for obtaining two estimates of the signal to noise ratio--one from the input signal and one from the output signal. It does not update the estimate of the noise level. The gain is a complicated mathematical function of these two estimates, so this method is not suitable for direct implementation on a digital processor.
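The limitation of subtracting a constant noise estimate can be seen in a small numerical sketch. It assumes, for illustration only, that the noise power measured in one frequency bin is exponentially distributed with unit mean; the sample count and seed are arbitrary.

```python
import random
import statistics

random.seed(0)

# Assumed model: noise-only power measurements in a single frequency bin.
noise_power = [random.expovariate(1.0) for _ in range(2000)]

# A constant noise estimate, here the average measured noise power.
noise_estimate = statistics.mean(noise_power)

# Spectral subtraction with a constant estimate, before any clamping.
residual = [p - noise_estimate for p in noise_power]

# Clamping negative values to zero, as is usually done in practice.
clamped = [max(r, 0.0) for r in residual]

mean_after = statistics.mean(residual)          # mean level is removed
var_before = statistics.pvariance(noise_power)
var_after = statistics.pvariance(residual)      # variance is unchanged
clamped_mean = statistics.mean(clamped)         # residual noise remains
```

Subtracting a constant shifts the mean to zero but leaves the frame-to-frame fluctuation intact, and after clamping the positive excursions survive as residual noise.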
In U.S. Pat. No. 5,012,519 to Aldersburg et al the gain estimation technique of Ephraim and Malah is combined with the noise parameter estimation method disclosed in U.S. Pat. No. 4,025,721 to Graupe and Causey to provide a fully adaptive system. The mathematical function of Ephraim and Malah is replaced with a two-dimensional lookup table to determine the gains. However, since the estimates of the signal to noise ratio can vary over a very large range, this table requires a large amount of expensive processor time and memory. Aldersburg et al use a separate voice detection system on the input signal which requires significant additional processing time.
There is thus an unmet need in the art to be able to utilize an efficient adaptive signal processing technique for the accurate and fast identification of noise. Processing time and memory efficiency would be improved if the noise estimates were made only during pauses in the information signal, so that noise estimates are updated only when an information signal is not detected. The algorithm should be capable of being implemented on inexpensive digital signal processors.
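The idea of updating the noise estimate only during pauses can be sketched in a few lines. This is a hypothetical illustration of the general principle, not a technique taken from any cited reference: the power-threshold pause test, the smoothing constant and the function name are all assumptions.

```python
def update_noise_estimate(noise, frame_power, threshold=2.0, alpha=0.9):
    """Update the noise power estimate for one band and one frame.

    Assumed pause test: the frame is treated as noise only when its power
    is below `threshold` times the current noise estimate. Otherwise an
    information signal is presumed present and the estimate is held.
    """
    if frame_power < threshold * noise:
        # Pause detected: exponentially smooth toward the new measurement.
        return alpha * noise + (1.0 - alpha) * frame_power
    return noise


noise = 1.0
for power in [1.1, 0.9, 8.0, 9.0, 1.0]:  # hypothetical frame powers
    noise = update_noise_estimate(noise, power)
```

Each update costs one comparison and two multiplications per band, which is the kind of load that suits an inexpensive processor; a fixed threshold of this sort would, however, fail to track a noise level that rises abruptly.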