Noise suppression techniques often reduce noise in an audio signal by classifying each signal component as speech or noise. In multi-microphone systems, this classification can be made by determining the energy difference between the microphone signals and distinguishing a speech source from noise sources based on the orientation and proximity of the source relative to the microphone array.
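By way of illustration only, the energy-difference classification described above can be sketched as follows. The sketch assumes a two-microphone arrangement with a primary microphone nearer the talker; the function names, the frame length, and the 6 dB threshold are hypothetical choices made for this example and are not taken from any particular system.

```python
import math

def frame_energy_db(frame, eps=1e-12):
    """Mean-square energy of one frame of samples, in dB."""
    energy = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(energy + eps)  # eps guards against log10(0)

def classify_frame(primary, secondary, threshold_db=6.0):
    """Label a frame by the inter-microphone energy difference.

    A nearby talker is assumed to be much louder at the primary microphone,
    while distant noise arrives at similar levels on both microphones.
    The threshold is an assumed value for illustration.
    """
    diff_db = frame_energy_db(primary) - frame_energy_db(secondary)
    label = "speech" if diff_db > threshold_db else "noise"
    return label, diff_db

# Hypothetical test frames: 160 samples of a sine wave (10 full periods).
tone = [math.sin(2.0 * math.pi * k / 16.0) for k in range(160)]

# Near talker: the secondary microphone receives one quarter of the amplitude.
near_label, near_diff = classify_frame(tone, [0.25 * x for x in tone])

# Distant noise: both microphones receive the same level.
far_label, far_diff = classify_frame([0.3 * x for x in tone],
                                     [0.3 * x for x in tone])
```

In this sketch the near-talker frame yields a level difference of roughly 12 dB and is labeled as speech, while the equal-level frame yields a difference of 0 dB and is labeled as noise.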
Classifying noise versus speech based on microphone energy differences is not always reliable. For example, variations in microphone sensitivity can make it difficult to determine source location by comparing energy levels from different microphones. Additionally, fairly common conditions, such as a user speaking into a phone from a greater distance (the far-talk use case) or hand occlusion that covers a microphone during use, can blur the distinction between a noise frame and a speech frame in terms of energy level difference. As a result, the probability distribution of the microphone energy level difference for noise overlaps the probability distribution of the microphone energy level difference for speech.
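The following sketch illustrates, under assumed numbers, how the conditions above can push a fixed-threshold decision to the wrong side. All level differences, gain errors, and the 6 dB threshold are hypothetical values chosen for illustration; they are not measurements.

```python
def observed_difference_db(true_diff_db, primary_gain_db=0.0, secondary_gain_db=0.0):
    """Level difference seen by the classifier after per-microphone gain errors.

    A gain error may model a sensitivity deviation or the attenuation caused
    by a hand covering the microphone (both are assumptions of this sketch).
    """
    return true_diff_db + primary_gain_db - secondary_gain_db

def classify(diff_db, threshold_db=6.0):
    """Fixed-threshold decision on the observed level difference."""
    return "speech" if diff_db > threshold_db else "noise"

# (actual source, true difference in dB, primary gain error, secondary gain error)
cases = {
    "close talk":           ("speech", 12.0,  0.0,  0.0),
    "far talk":             ("speech",  4.0,  0.0,  0.0),
    "occluded primary mic": ("speech", 12.0, -8.0,  0.0),
    "sensitivity mismatch": ("noise",   0.0,  3.0, -3.5),
}

# Map each case name to (actual source, classifier decision).
results = {
    name: (actual, classify(observed_difference_db(diff, g1, g2)))
    for name, (actual, diff, g1, g2) in cases.items()
}
```

With these assumed values, only the close-talk frame is classified correctly: the far-talk and occluded speech frames fall below the threshold, and the noise frame with mismatched sensitivities rises above it.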
To overcome the shortcomings of the prior art, there is a need for an improved noise suppression system for classifying noise and speech.