Mobile communication systems allow a mobile phone to be used in different environments, such that the voice of the near-end user is mixed with a variety of types and levels of background noise surrounding the near-end user. Mobile phones now have at least two microphones, a primary or “bottom” microphone and a secondary or “top” microphone, both of which pick up the near-end user's voice as well as background noise. A digital noise suppression algorithm processes the two microphone signals so as to reduce the amount of background noise that is present in the primary signal. This helps make the near-end user's voice more intelligible for the far-end user.
Noise suppression algorithms need an accurate estimate of the noise spectrum so that they can apply the correct amount of attenuation to the primary signal. Too much attenuation will muffle the near-end user's speech, while too little will allow background noise to overwhelm the speech. Examples of such noise suppression algorithms include variants of dynamic Wiener filtering, such as power spectral subtraction and magnitude spectral subtraction.
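The dependence of the attenuation on the noise estimate can be illustrated with a minimal sketch of textbook power and magnitude spectral subtraction, written as a per-bin gain. This is not taken from any particular product; the function name `spectral_subtraction_gain`, the `mode` argument, and the `gain_floor` value are assumptions made only for illustration.

```python
import math

def spectral_subtraction_gain(noisy_power, noise_power, mode="power", gain_floor=0.1):
    """Return one attenuation gain per frequency bin for a single frame.

    noisy_power -- power |Y|^2 of the primary signal in each bin
    noise_power -- estimated noise power |N|^2 in each bin
    gain_floor  -- hypothetical lower limit on the gain (an assumption)
    """
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        if p_y <= 0.0:
            # No signal energy in this bin; fall back to the floor.
            gains.append(gain_floor)
            continue
        if mode == "power":
            # Power spectral subtraction: |S|^2 = |Y|^2 - |N|^2
            g = math.sqrt(max(1.0 - p_n / p_y, 0.0))
        elif mode == "magnitude":
            # Magnitude spectral subtraction: |S| = |Y| - |N|
            g = max(1.0 - math.sqrt(p_n / p_y), 0.0)
        else:
            raise ValueError("mode must be 'power' or 'magnitude'")
        gains.append(max(g, gain_floor))
    return gains
```

In this sketch, a noise estimate that is too high drives the gain toward the floor and removes speech energy, while one that is too low leaves the gain near one and passes the noise through, which is the trade-off described above.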
To obtain an accurate noise estimate, a voice activity detection (VAD) function may be used that processes the microphone signals (e.g., computes their strength difference on a per frequency bin and per frame basis) to indicate which frequency bins (in a given frame of the primary signal) are likely speech, and which ones are likely non-speech (noise). The VAD function uses at least one threshold in order to provide its decision. Such thresholds can be tuned during testing to find the right compromise for a variety of “in-the-field” background noise environments and for the different ways in which the user holds the mobile phone when talking. When the difference between the microphone signals exceeds the selected threshold, speech is indicated; when the difference falls below the threshold, noise is indicated. Such VAD decisions are then used to produce a full spectrum noise estimate (using information in one or both of the two microphone signals).
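The per-bin decision described above may be sketched as follows, assuming that the strength difference is measured as a level difference in decibels between the primary and secondary power spectra. The 6 dB threshold, the recursive smoothing used for the noise update, and the names `bin_vad`, `update_noise_estimate`, and `alpha` are assumptions for illustration, not details from the description.

```python
import math

def bin_vad(primary_power, secondary_power, threshold_db=6.0, eps=1e-12):
    """Return True (speech) or False (noise) for each frequency bin of a frame."""
    decisions = []
    for p1, p2 in zip(primary_power, secondary_power):
        # Level difference between the two microphones in this bin, in dB.
        diff_db = 10.0 * math.log10((p1 + eps) / (p2 + eps))
        decisions.append(diff_db > threshold_db)
    return decisions

def update_noise_estimate(noise_est, primary_power, is_speech, alpha=0.9):
    """Update the noise spectrum only in bins that the VAD marked as noise."""
    updated = []
    for n_prev, p1, speech in zip(noise_est, primary_power, is_speech):
        if speech:
            # Hold the previous estimate so speech does not leak into it.
            updated.append(n_prev)
        else:
            # Assumed first-order recursive smoothing toward the current frame.
            updated.append(alpha * n_prev + (1.0 - alpha) * p1)
    return updated
```

A caller would run `bin_vad` on each frame and pass its output to `update_noise_estimate`. Raising `threshold_db` makes the detector classify more bins as noise, which is one way the tuning compromise mentioned above could appear in practice.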