1. Field of the Invention
The present invention is generally in the field of speech processing. More specifically, the invention is in the field of noise suppression for speech coding and speech recognition.
2. Related Art
Presently, there are a number of approaches for reducing background noise in a source signal (also referred to as “noise suppression”). As is known in the art, noise suppression is an important feature for improving the performance of speech coding and/or speech recognition systems. Noise suppression offers a number of benefits, including suppressing the background noise so that the party at the receiving side can hear the caller better, improving speech intelligibility, improving echo cancellation performance, and improving the performance of automatic speech recognition (“ASR”), among others.
Spectral subtraction is a known method for noise suppression, and is based on the assumption that a source signal, x(t), is composed of a clean speech signal, s(t), in addition to a noise signal, n(t), that is stationary and uncorrelated with the clean speech signal, as given by:
x(t) = s(t) + n(t)  (Equation 1).
The noise subtraction is processed in the frequency domain using the short-time Fourier transform. It is assumed that the noise signal is estimated from a signal portion consisting of pure noise. Then, the short-time clean speech spectrum, |Ŝ(m,k)|, can be estimated by subtracting the short-time noise estimate, |N̂(m,k)|, from the short-time noisy speech spectrum, |X(m,k)|, as given by:
|Ŝ(m,k)| = |X(m,k)| − |N̂(m,k)|  (Equation 2).
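As an illustration only, the subtraction of Equation 2 can be sketched in Python on magnitude spectra that are already available as lists. The function names, the reading of m as a frame index and k as a frequency-bin index, and the choice to average the magnitudes of noise-only frames to obtain the noise estimate are assumptions made for this sketch, not details taken from the text above.

```python
def estimate_noise_magnitude(noise_frames):
    """Average |X(m,k)| per bin over frames assumed to contain only noise.

    Averaging is an assumption of this sketch; the text only states that
    the noise is estimated from a pure-noise portion of the signal.
    """
    num_frames = len(noise_frames)
    num_bins = len(noise_frames[0])
    return [
        sum(frame[k] for frame in noise_frames) / num_frames
        for k in range(num_bins)
    ]


def spectral_subtract(noisy_mag, noise_mag):
    """Equation 2: per-bin magnitude subtraction (results may be negative)."""
    return [x - n for x, n in zip(noisy_mag, noise_mag)]


# Two hypothetical noise-only frames with three frequency bins each.
noise_frames = [[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]]
noise_mag = estimate_noise_magnitude(noise_frames)

# One hypothetical noisy speech frame.
clean_mag = spectral_subtract([5.0, 1.5, 1.0], noise_mag)
print(noise_mag)  # [2.0, 2.0, 1.0]
print(clean_mag)  # [3.0, -0.5, 0.0]
```

The second bin comes out negative because the noise estimate exceeds the noisy magnitude there, which is not a valid magnitude.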
The noise-reduced speech signal, Ŝ(m,k), is then re-synthesized using the original phase spectrum of the source signal. This simple form of spectral subtraction produces undesired signal distortions, such as a “running water” effect and “musical noise,” if the noise estimate is either too low or too high. It is possible to eliminate the musical noise by subtracting more than the average noise spectrum. This leads to the Generalized Spectral Subtraction (“GSS”) method, which is given by:
|Ŝ(m,k)| = |X(m,k)| − α|N̂(m,k)|  (Equation 3),
where α is a factor greater than one that scales the noise estimate.
In addition, to avoid negative estimates of the speech magnitude, the negative magnitudes are sometimes replaced by zeros or by a spectral floor, as given by:
|Ŝ(m,k)| = max(|X(m,k)| − α|N̂(m,k)|, β|X(m,k)|)  (Equation 4),
where β sets the floor as a fraction of the noisy speech magnitude.
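A minimal Python sketch of Equations 3 and 4 applied to a single frame, followed by re-synthesis with the original phase, might look as follows. The function name gss_frame, the default values of alpha and beta, and the representation of the frame as a list of complex bins are assumptions chosen for illustration; they are not specified by the text above.

```python
import cmath


def gss_frame(noisy_spectrum, noise_mag, alpha=2.0, beta=0.01):
    """Apply over-subtraction with a spectral floor to one frame.

    noisy_spectrum: complex bins X(m,k) of the noisy frame.
    noise_mag:      noise magnitude estimate per bin.
    alpha, beta:    illustrative defaults, not values from the source.
    """
    cleaned = []
    for x, n in zip(noisy_spectrum, noise_mag):
        mag = abs(x)
        # Equation 4: over-subtract, but never fall below beta * |X(m,k)|.
        est = max(mag - alpha * n, beta * mag)
        # Reattach the phase of the noisy bin to the estimated magnitude.
        cleaned.append(cmath.rect(est, cmath.phase(x)))
    return cleaned


# Hypothetical two-bin frame: the first bin is dominated by speech,
# the second by noise.
frame = [3 + 4j, 1j]
out = gss_frame(frame, [1.0, 1.0], alpha=2.0, beta=0.1)
print([round(abs(s), 6) for s in out])  # [3.0, 0.1]
```

In the second bin the over-subtracted value would be negative, so the floor of β|X(m,k)| is used instead. Setting beta to zero reduces the floor to the replace-by-zeros variant.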
It is possible to suppress unwanted noise effectively with GSS by using a very large value for α; however, the speech sounds will be muffled and intelligibility will be lost. Accordingly, there exists a strong need in the art for a computationally efficient background noise suppressor for speech coding and speech recognition, which suppresses unwanted noise effectively while maintaining reasonably high intelligibility.