Mobile voice communications products are used in a variety of environments, many of which can be extremely noisy. Background noise masks the desired speech signal and reduces the intelligibility of the speech in both the sending and receiving environments. Many mobile voice communications products contain processing components that attempt to mitigate the effect of the noise on the speech signal. On the uplink (transmit) side, many products employ some type of noise suppression system to clean up a noisy speech signal before any coding or modulation is applied. Suppressing the noise improves the performance of the codec or modulator. Currently, many different noise suppression methods are used in voice communications products. Many are based on the noise suppression algorithm specified in the TIA/EIA/IS-127 EVRC codec standard (TIA/EIA/IS-127, “Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems”, July 1996), or on variations of it. The IS-127 noise suppressor belongs to the class of single-input spectral subtraction noise suppressors, in which an estimate of the spectral energy characteristics of the background noise is used to remove noise from the noisy speech signal.
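The general idea behind spectral subtraction can be illustrated with a minimal sketch. It assumes per-band energies are already available for the current frame and for the background noise estimate. The function names, the over-subtraction factor, and the gain floor are hypothetical illustration choices; this is generic power spectral subtraction, not the IS-127 algorithm itself.

```python
def spectral_subtraction_gains(noisy_energy, noise_energy,
                               over_subtraction=1.0, gain_floor=0.1):
    """Return one energy-domain gain per frequency band.

    noisy_energy: per-band energy of the noisy speech frame
    noise_energy: per-band estimate of the background noise energy
    over_subtraction, gain_floor: hypothetical tuning parameters
    """
    gains = []
    for x, n in zip(noisy_energy, noise_energy):
        if x <= 0.0:
            # No signal energy in this band: fall back to the floor.
            gains.append(gain_floor)
            continue
        # Subtracting the noise energy, (x - a*n), is equivalent to
        # scaling x by the gain (1 - a*n/x).
        g = 1.0 - over_subtraction * n / x
        # The floor limits attenuation so bands never go negative.
        gains.append(max(g, gain_floor))
    return gains


def suppress(noisy_energy, noise_energy, **params):
    """Apply the gains to obtain the noise-reduced band energies."""
    gains = spectral_subtraction_gains(noisy_energy, noise_energy, **params)
    return [g * x for g, x in zip(gains, noisy_energy)]


if __name__ == "__main__":
    noisy = [10.0, 4.0, 1.0]   # band energies of a noisy frame
    noise = [1.0, 1.0, 1.0]    # background noise estimate
    print(suppress(noisy, noise))
```

The quality of the result depends directly on `noise_energy`: if the estimate is too low, residual noise remains, and if it is too high, speech energy is removed along with the noise.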
On the downlink (receive) side, some communication devices use automatic volume control (AVC), dynamic gain compression, or spectral shaping of the received speech output to improve intelligibility according to the listener's ambient noise environment. Such a system is described by Song et al. in US20060270467 A1, Nov. 30, 2006, “Method and Apparatus of increasing speech Intelligibility in Noisy Environments”; it depends on an accurate estimate of the background noise for its operation.
Paramount to the successful operation of noise-related processing techniques is an accurate, current, short-term estimate of the background noise spectral energy. Short-term here means over the duration of meaningful segments of speech, i.e., syllables and words. For stationary or slowly changing random noise sources this is not usually a problem, since the mean noise energy is constant over a period that is long relative to the speech. The sample average of the noise closely approximates its expected value and can usually be determined from a few signal segments identified as not containing speech. For nonstationary noises this is not the case: the noise may change rapidly relative to the speech modulation rate, requiring that the noise estimate be updated much more frequently. For nonstationary noises, or speech-like noise such as babble noise, many commonly used methods for tracking and estimating the noise can lag or be error-prone, resulting in faulty operation of the noise processors in the communication device that rely on an accurate noise estimate. Thus, accurate methods for estimating and tracking nonstationary noises are useful and necessary.
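The lag described above can be made concrete with a minimal sketch of one common style of noise tracker: a recursive average that is updated only on frames flagged as non-speech. The function name, the smoothing constant, and the assumption that the first frame contains only noise are illustrative choices, not taken from any cited system.

```python
def track_noise(frame_energies, is_speech, smoothing=0.9):
    """Track the noise energy with a recursive average.

    frame_energies: energy of each signal frame
    is_speech: per-frame flags from a (hypothetical) speech detector
    smoothing: hypothetical constant; closer to 1.0 means slower updates
    Assumes the first frame contains noise only and seeds the estimate.
    """
    estimate = None
    history = []
    for energy, speech in zip(frame_energies, is_speech):
        if estimate is None:
            estimate = energy          # seed from the first frame
        elif not speech:
            # Update only when the frame is believed to be noise.
            estimate = smoothing * estimate + (1.0 - smoothing) * energy
        # During speech the estimate is frozen at its last value.
        history.append(estimate)
    return history


if __name__ == "__main__":
    # Noise energy steps from 1.0 to 4.0 halfway through; no speech.
    energies = [1.0] * 20 + [4.0] * 20
    flags = [False] * 40
    est = track_noise(energies, flags)
    print(est[19], est[24], est[39])
```

With stationary noise the estimate settles and stays correct. After the step change it approaches the new level only gradually, and it cannot update at all while frames are flagged as speech, which is the lag that makes such trackers unreliable for nonstationary or speech-like noise.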