The present invention provides a noise suppression technique suitable for use as a front end to a low-bitrate speech coder. The inventive technique is particularly suitable for use in cellular telephony applications.
The following prior art documents provide technological background for the present invention:
"ENHANCED VARIABLE RATE CODEC, SPEECH SERVICE OPTION 3 FOR WIDEBAND SPREAD SPECTRUM DIGITAL SYSTEMS," TIA/EIA/IS-127 Standard.
"THE STUDY OF SPEECH/PAUSE DETECTORS FOR SPEECH ENHANCEMENT METHODS," P. Sovka and P. Pollak, Eurospeech 95 Madrid, 1995, pp. 1575-1578.
"SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR SHORT-TIME SPECTRAL AMPLITUDE ESTIMATOR," Y. Ephraim, D. Malah, IEEE Transactions on Acoustics Speech and Signal Processing, Vol. ASSP-32, No. 6, December 1984, pp. 1109-1121.
"SUPPRESSION OF ACOUSTIC NOISE USING SPECTRAL SUBTRACTION," S. Boll, IEEE Transactions on Acoustics Speech and Signal Processing, Vol. ASSP-27, No. 2, April 1979, pp. 113-120.
"STATISTICAL-MODEL-BASED SPEECH ENHANCEMENT SYSTEMS," Proceedings of the IEEE, Vol. 80, No. 10, October 1992, pp. 1526-1544.
A low-complexity approach to noise suppression is spectral modification (also known as spectral subtraction). Noise suppression algorithms using spectral modification first divide the noisy speech signal into several frequency bands. A gain is then computed for each band, typically based on an estimated signal-to-noise ratio in that band. These gains are applied to the bands and a signal is reconstructed. This type of scheme must estimate signal and noise characteristics from the observed noisy speech signal. Several implementations of spectral modification techniques can be found in U.S. Pat. Nos. 5,687,285; 5,680,393; 5,668,927; 5,659,622; 5,651,071; 5,630,015; 5,625,684; 5,621,850; 5,617,505; 5,617,472; 5,602,962; 5,577,161; 5,555,287; 5,550,924; 5,544,250; 5,539,859; 5,533,133; 5,530,768; 5,479,560; 5,432,859; 5,406,635; 5,402,496; 5,388,182; 5,388,160; 5,353,376; 5,319,736; 5,278,780; 5,251,263; 5,168,526; 5,133,013; 5,081,681; 5,040,156; 5,012,519; 4,908,855; 4,897,878; 4,811,404; 4,747,143; 4,737,976; 4,630,305; 4,630,304; 4,628,529; and 4,468,804.
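The generic spectral modification steps described above (split into bands, compute a per-band gain from an estimated signal-to-noise ratio, apply the gains, reconstruct) can be illustrated with a minimal sketch for a single frame. This is not the method of any of the cited patents. The function names, the uniform band layout, the Wiener-like gain rule, and the `gain_floor` parameter are all assumptions chosen for illustration, and the per-band noise power is assumed to be known and positive rather than estimated from the signal.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, used here only to keep the sketch dependency-free."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def band_of_bin(k, n, n_bands):
    """Map DFT bin k to a band index (hypothetical uniform band layout).

    Bins k and n - k share a band, so a real input gives a real output.
    """
    half = n // 2
    freq = min(k, n - k)
    return min(freq * n_bands // (half + 1), n_bands - 1)


def suppress_frame(frame, noise_power, gain_floor=0.1):
    """Apply per-band gains to one frame; returns (output samples, gains).

    noise_power holds one assumed-known, positive noise power per band.
    """
    n = len(frame)
    n_bands = len(noise_power)
    spectrum = dft(frame)

    # Mean observed power in each band.
    power = [0.0] * n_bands
    count = [0] * n_bands
    for k, value in enumerate(spectrum):
        b = band_of_bin(k, n, n_bands)
        power[b] += abs(value) ** 2
        count[b] += 1

    # Assumed gain rule: Wiener-like gain from an estimated SNR, with a floor.
    gains = []
    for b in range(n_bands):
        mean_power = power[b] / count[b] if count[b] else 0.0
        snr = max(mean_power / noise_power[b] - 1.0, 0.0)
        gains.append(max(snr / (1.0 + snr), gain_floor))

    # Apply the gains and reconstruct the time-domain frame.
    shaped = [v * gains[band_of_bin(k, n, n_bands)]
              for k, v in enumerate(spectrum)]
    return [v.real for v in dft(shaped, inverse=True)], gains


# Usage: a low-frequency tone well above the assumed noise level.
tone = [math.cos(2 * math.pi * t / 16) for t in range(16)]
cleaned, band_gains = suppress_frame(tone, noise_power=[1.0] * 4)
```

In this sketch the band containing the tone keeps a gain close to 1, while bands with no signal energy fall to the floor. The floor limits how deeply any band is attenuated, which is one common way to trade residual noise against distortion.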
Spectral modification has several desirable properties. First, it can be made adaptive and hence can handle a changing noise environment. Second, much of the computation can be performed in the discrete Fourier transform (DFT) domain, so fast algorithms such as the fast Fourier transform (FFT) can be used.
There are, however, several shortcomings in the current state of the art. These include:
(i) objectionable distortion of the desired speech signal in moderate to high noise levels (such distortions have several causes, some of which are detailed below); and
(ii) excessive computational complexity.
It would be advantageous to provide a noise suppression technique that overcomes the disadvantages of the prior art. In particular, it would be advantageous to provide a noise suppression technique that accounts for the time-domain discontinuities typical of block-based noise suppression techniques. It would be further advantageous to provide such a technique that reduces distortion due to the frequency-domain discontinuities inherent in spectral subtraction. It would be still further advantageous to reduce the complexity of spectral shaping operations in providing noise suppression, and to increase the reliability of estimated noise statistics in a noise suppression technique.
The present invention provides a noise suppression technique having these and other advantages.