In audio communication, it is typically expedient to transmit a user's voice undistorted and free of noise. However, communication devices are often employed in noisy environments; the signals picked up by a device's microphones are mixtures of the user's voice and interfering noise.
The characteristics of the sound field at the microphones vary substantially across different signal and noise scenarios. For instance, the sound may come from a single direction or from many directions simultaneously. It may originate far away from—or close to the microphones. It may be stationary/constant or non-stationary/transient. The noise may also be generated by wind turbulence at the microphone ports.
Multi-microphone background noise reduction methods fall in two general categories. The first type is beamforming, where the output samples are computed as a linear combination of the input samples. The second type is noise suppression, where the noise component is reduced by applying a time-variant filter to the signal, such as by multiplying a time and frequency dependant gain on the signal in a filter bank domain.
When only one microphone or audio input is available, a noise suppression filter cannot be spatially sensitive. There is no access to the spatial features of the sound field, providing discriminative information about speech and background noise, and is typically limited only to suppress the stationary or quasi-stationary component of the background noise.
Beamforming and noise suppression may be sequentially applied, since their noise reduction effects are additive.
An example of an adaptive beamformer is disclosed in WO 2009/132646 A1.
A method of separating mixtures of sound is disclosed in “O. Yilmaz and S. Rickard, Blind Separation of Speech Mixtures via Time-Frequency Masking, IEEE Transactions on Signal Processing, Vol. 52, No. 7, pages 1830-1847, July 2004”. Separation masks are computed in a time-frequency representation on the basis of two features, namely the level difference and phase-delay between the two sensor signals.
A method of combining directional noise suppression and a stationary noise suppression algorithm is disclosed in WO 2009/096958 A1. However, this method does not take into account a spatial noise suppression component which takes advantage of combining a set of spatially discriminative features besides directional features.