The present invention relates to communications systems, and more particularly, to methods and apparatus for mitigating the effects of disruptive background noise components in communications signals.
Today, technology and consumer demand have produced mobile telephones of diminishing size. As mobile telephones become smaller and smaller, the microphone ends up farther and farther from the speaker's (near-end user's) mouth during use. This increased distance increases the need for speech enhancement, because disruptive background noise is picked up at the microphone and transmitted to a far-end user. In other words, since the distance between the microphone and the near-end user is larger in the newer, smaller mobile telephones, the microphone picks up not only the near-end user's speech, but also any noise which happens to be present at the near-end location. For example, the near-end microphone typically picks up sounds such as surrounding traffic, road and passenger compartment noise, room noise, and the like. The resulting noisy near-end speech can be annoying or even intolerable for the far-end user. It is thus desirable that the background noise be reduced as much as possible, preferably early in the near-end signal processing chain (e.g., before the received near-end microphone signal is supplied to a near-end speech coder).
Because of interfering background noise, some telephone systems include a noise reduction processor designed to reduce background noise at the input of a near-end signal processing chain. FIG. 1 is a high-level block diagram of such a system 100. In FIG. 1, a noise reduction processor 110 is positioned at the output of a microphone 120 and at the input of a near-end signal processing path (not shown). In operation, the noise reduction processor 110 receives a noisy speech signal x from the microphone 120 and processes the noisy speech signal x to provide a cleaner, noise-reduced speech signal s_NR, which is passed through the near-end signal processing chain and ultimately to the far-end user.
One well-known method for implementing the noise reduction processor 110 of FIG. 1 is referred to in the art as spectral subtraction. See, for example, S. F. Boll, “Suppression of Acoustic Noise in Speech using Spectral Subtraction”, IEEE Trans. Acoust. Speech and Sig. Proc., 27:113-120, 1979, which is incorporated herein by reference in its entirety. Generally, spectral subtraction uses estimates of the noise spectrum and the noisy speech spectrum to form a signal-to-noise ratio (SNR) based gain function which is multiplied by the input spectrum to suppress frequencies having a low SNR. Though spectral subtraction does provide significant noise reduction, it suffers from several well-known disadvantages. For example, the spectral subtraction output signal typically contains artifacts known in the art as musical tones. Further, discontinuities between processed signal blocks often lead to diminished speech quality from the far-end user perspective.
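The SNR-based gain function described above can be illustrated with a minimal sketch. It assumes the common generalized form G = (1 - k (|N|/|X|)^a)^(1/a) applied per frequency bin; the function name, the subtraction factor `k`, the exponent `a`, the gain floor, and the sample magnitudes are illustrative assumptions and are not taken from the cited references.

```python
def spectral_subtraction_gain(noisy_mag, noise_mag, k=1.0, a=2.0, floor=0.0):
    """Return one gain per frequency bin.

    noisy_mag: magnitude spectrum |X| of the noisy speech block
    noise_mag: estimated magnitude spectrum |N| of the noise
    k:         subtraction factor (assumed parameter)
    a:         exponent; 1 gives magnitude subtraction, 2 power subtraction
    floor:     lower limit on the gain (assumed parameter)
    """
    gains = []
    for x, n in zip(noisy_mag, noise_mag):
        if x <= 0.0:
            # No input energy in this bin, so nothing can be passed through.
            gains.append(floor)
            continue
        g = max(1.0 - k * (n / x) ** a, 0.0) ** (1.0 / a)
        gains.append(max(g, floor))
    return gains


# Toy magnitudes: bin 0 has a high SNR, bin 2 contains only noise.
noisy = [10.0, 2.0, 1.0]
noise = [1.0, 1.0, 1.0]
gains = spectral_subtraction_gain(noisy, noise)

# The gain is multiplied by the input spectrum, attenuating low-SNR bins.
enhanced = [g * x for g, x in zip(gains, noisy)]
```

In this toy case the high-SNR bin passes almost unchanged, while the bin whose level equals the noise estimate is removed entirely. Hard decisions of this kind, varying from block to block, are one source of the musical tones mentioned above.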
Many enhancements to the basic spectral subtraction method have been developed in recent years. See, for example, N. Virag, “Speech Enhancement Based on Masking Properties of the Auditory System,” IEEE ICASSP. Proc. 796-799 vol. 1, 1995; D. Tsoukalas, M. Paraskevas and J. Mourjopoulos, “Speech Enhancement using Psychoacoustic Criteria,” IEEE ICASSP. Proc., 359-362 vol. 2, 1993; F. Xie and D. Van Compernolle, “Speech Enhancement by Spectral Magnitude Estimation - A Unifying Approach,” IEEE Speech Communication, 89-104 vol. 19, 1996; R. Martin, “Spectral Subtraction Based on Minimum Statistics,” EUSIPCO, Proc., 1182-1185 vol. 2, 1994; and S. M. McOlash, R. J. Niederjohn and J. A. Heinen, “A Spectral Subtraction Method for Enhancement of Speech Corrupted by Nonwhite, Nonstationary Noise,” IEEE IECON. Proc., 872-877 vol. 2, 1995.
More recently, spectral subtraction has been implemented using correct convolution and spectrum dependent exponential gain function averaging. These techniques are described in co-pending U.S. patent application Ser. No. 09/084,387, filed May 27, 1998 and entitled “Signal Noise Reduction by Spectral Subtraction using Linear Convolution and Causal Filtering” and co-pending U.S. patent application Ser. No. 09/084,503, also filed May 27, 1998 and entitled “Signal Noise Reduction by Spectral Subtraction using Spectrum Dependent Exponential Gain Function Averaging.”
Spectral subtraction uses two spectrum estimates, one being the “disturbed” signal and one being the “disturbing” signal, to form a signal-to-noise ratio (SNR) based gain function. The disturbed spectrum is multiplied by the gain function to increase the SNR of that spectrum. In single microphone spectral subtraction applications, such as those used in conjunction with hands-free telephones, speech is enhanced relative to the disturbing background noise. The noise is estimated during speech pauses or, during speech, with the help of a noise model. This implies either that the noise must be stationary, so that it has similar properties during speech, or that the model must be suitable for the changing background noise. Unfortunately, this is not the case for most background noises in everyday surroundings.
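The dependence on speech pauses can be sketched as follows. This is a minimal illustration only: it assumes that a speech activity decision is supplied from elsewhere and that the noise spectrum is tracked by exponential averaging. The function name, the `speech_active` flag, the smoothing constant `alpha`, and the sample values are all hypothetical.

```python
def update_noise_estimate(noise_mag, frame_mag, speech_active, alpha=0.9):
    """Update a running noise magnitude estimate from one block.

    The estimate is only refreshed during speech pauses; while speech
    is active it is frozen, which is why the noise has to stay
    stationary for the estimate to remain valid.
    """
    if speech_active:
        return list(noise_mag)
    return [alpha * n + (1.0 - alpha) * x for n, x in zip(noise_mag, frame_mag)]


noise = [1.0, 1.0]

# Speech pause: the block contains only noise, so the estimate moves toward it.
noise_after_pause = update_noise_estimate(noise, [3.0, 1.0], speech_active=False)

# Speech active: the estimate is frozen even if the background changes.
noise_after_speech = update_noise_estimate(
    noise_after_pause, [9.0, 9.0], speech_active=True
)
```

If the background noise changes while the estimate is frozen, the gain function is computed from an outdated noise spectrum, which is the limitation described above.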
Therefore, there is a need for a noise reduction system which uses the techniques of spectral subtraction and which is suitable for use with most every-day variable background noises.
The present invention fulfills the above-described and other needs by providing methods and apparatus for performing noise reduction by spectral subtraction in a dual microphone system. According to exemplary embodiments, when a far-mouth microphone is used in conjunction with a near-mouth microphone, it is possible to handle non-stationary background noise as long as the noise spectrum can continuously be estimated from a single block of input samples. The far-mouth microphone, in addition to picking up the background noise, also picks up the speaker's voice, albeit at a lower level than the near-mouth microphone. To enhance the noise estimate, a spectral subtraction stage is used to suppress the speech in the far-mouth microphone signal. To make this enhancement possible, a rough speech estimate is formed from the near-mouth signal with another spectral subtraction stage. Finally, a third spectral subtraction stage is used to enhance the near-mouth signal by suppressing the background noise using the enhanced background noise estimate. A controller dynamically determines any or all of a first, second, and third subtraction factor for each of the first, second, and third spectral subtraction stages, respectively.
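The three-stage arrangement summarized above can be sketched on magnitude spectra as follows. This is an illustration under stated assumptions, not the claimed implementation: it assumes plain magnitude subtraction in each stage, assumes that the far-mouth spectrum serves as the disturbing signal when forming the rough speech estimate, and uses fixed subtraction factors `k1`, `k2`, `k3` where the text describes a controller that determines them dynamically. All names and sample values are hypothetical.

```python
def subtract(disturbed, disturbing, k):
    """Scale each bin of `disturbed` by the gain max(1 - k * d / x, 0)."""
    out = []
    for x, d in zip(disturbed, disturbing):
        if x <= 0.0:
            out.append(0.0)
            continue
        out.append(max(1.0 - k * d / x, 0.0) * x)
    return out


def dual_mic_noise_reduction(near_mag, far_mag, k1, k2, k3):
    # Rough speech estimate from the near-mouth signal
    # (assumption: the far-mouth signal acts as the raw noise reference).
    rough_speech = subtract(near_mag, far_mag, k1)
    # Suppress the speech in the far-mouth signal to enhance the noise estimate.
    noise_est = subtract(far_mag, rough_speech, k2)
    # Enhance the near-mouth signal using the enhanced noise estimate.
    enhanced = subtract(near_mag, noise_est, k3)
    return rough_speech, noise_est, enhanced


# Toy data, assuming magnitudes simply add: speech reaches the far-mouth
# microphone at a quarter of its near-mouth level.
speech = [8.0, 0.0, 4.0]
noise = [1.0, 2.0, 1.0]
near = [s + n for s, n in zip(speech, noise)]
far = [0.25 * s + n for s, n in zip(speech, noise)]

rough_speech, noise_est, enhanced = dual_mic_noise_reduction(
    near, far, k1=1.0, k2=0.25, k3=1.0
)
```

Because every quantity is computed from the current block of both microphone signals, the noise estimate does not depend on speech pauses. In this toy case the enhanced noise estimate is closer to the true noise than the raw far-mouth spectrum, although the result depends strongly on the chosen subtraction factors.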