Microphone arrays are increasingly recognized as an effective tool for combating noise, interference, and reverberation when acquiring speech in adverse acoustic environments. Applications include robust speech recognition, hands-free voice communication and teleconferencing, and hearing aids, to name just a few. Beamforming is a traditional microphone array processing technique that provides a form of spatial filtering: it receives signals arriving from specific directions while attenuating signals from other directions. Although beamforming achieves spatial filtering, its output is not optimal in the minimum mean square error (MMSE) sense from a signal reconstruction perspective, which is why a post-filter is commonly applied to the beamformer output.
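As an illustration of spatial filtering, the following minimal sketch evaluates the response of a delay-and-sum beamformer, one of the simplest beamformers and not necessarily the one contemplated here. It assumes a uniform linear array, far-field plane waves, a single frequency, and a sound speed of 343 m/s; the function names, array geometry, and frequency are hypothetical choices made only for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(angle_rad, mic_positions_m, freq_hz):
    """Far-field steering vector of a linear array.

    angle_rad is the arrival angle measured from broadside (assumed convention).
    """
    delays = mic_positions_m * np.sin(angle_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def delay_and_sum_response(look_rad, arrival_rad, mic_positions_m, freq_hz):
    """Magnitude response of a delay-and-sum beamformer steered to look_rad
    for a plane wave arriving from arrival_rad."""
    num_mics = len(mic_positions_m)
    weights = steering_vector(look_rad, mic_positions_m, freq_hz) / num_mics
    arriving = steering_vector(arrival_rad, mic_positions_m, freq_hz)
    return np.abs(weights.conj() @ arriving)


# Hypothetical setup: 8 microphones spaced 4 cm apart, evaluated at 2 kHz.
mics = np.arange(8) * 0.04
on_axis = delay_and_sum_response(0.0, 0.0, mics, 2000.0)
off_axis = delay_and_sum_response(0.0, np.deg2rad(60.0), mics, 2000.0)
print(f"gain toward look direction: {on_axis:.3f}")
print(f"gain toward 60 degrees:     {off_axis:.3f}")
```

The signal from the look direction passes with unit gain, while a signal from another direction is attenuated by an amount that depends on the array geometry and frequency.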
One conventional method for post-filtering is the multichannel Wiener filter (MCWF), which can be decomposed into a minimum variance distortionless response (MVDR) beamformer followed by a single-channel post-filter. Conventional post-filtering methods can improve speech quality after beamforming; however, they share two limitations. First, they assume that the noise is either white (incoherent) or diffuse, and therefore do not address point interferers. A point interferer is, for example, another talker in an environment where several people are speaking and only one of them is the desired audio source. Second, they rely on a heuristic in which post-filter coefficients are estimated from two microphones at a time and then averaged over all microphone pairs, which leads to sub-optimal results.
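The decomposition of the MCWF into an MVDR beamformer and a single-channel post-filter can be checked numerically. The sketch below assumes the standard single-source, per-frequency-bin model x = d s + v, with a known steering vector d, speech power phi_s, and noise covariance Phi_v, and it uses the convention that the output is y = h^H x. All function and variable names are hypothetical, and the covariances are synthetic rather than estimated from recordings.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """MVDR beamformer: Phi_v^{-1} d / (d^H Phi_v^{-1} d)."""
    pv_inv_d = np.linalg.solve(noise_cov, steering)
    return pv_inv_d / (steering.conj() @ pv_inv_d)


def wiener_postfilter_gain(speech_psd, noise_cov, steering):
    """Single-channel Wiener gain applied to the MVDR output.

    The residual noise power at the MVDR output is 1 / (d^H Phi_v^{-1} d).
    """
    pv_inv_d = np.linalg.solve(noise_cov, steering)
    residual_noise_psd = 1.0 / np.real(steering.conj() @ pv_inv_d)
    return speech_psd / (speech_psd + residual_noise_psd)


def mcwf_weights(speech_psd, noise_cov, steering):
    """Multichannel Wiener filter: Phi_x^{-1} (phi_s d),
    with Phi_x = phi_s d d^H + Phi_v (rank-1 speech covariance assumed)."""
    mixture_cov = speech_psd * np.outer(steering, steering.conj()) + noise_cov
    return np.linalg.solve(mixture_cov, speech_psd * steering)


# Synthetic example: 4 microphones and a random Hermitian positive-definite
# noise covariance, which is not restricted to white or diffuse noise.
rng = np.random.default_rng(0)
num_mics = 4
a = rng.standard_normal((num_mics, num_mics)) \
    + 1j * rng.standard_normal((num_mics, num_mics))
noise_cov = a @ a.conj().T + num_mics * np.eye(num_mics)
steering = np.exp(-1j * rng.uniform(0.0, 2.0 * np.pi, num_mics))
speech_psd = 2.0

h_mcwf = mcwf_weights(speech_psd, noise_cov, steering)
h_mvdr = mvdr_weights(noise_cov, steering)
gain = wiener_postfilter_gain(speech_psd, noise_cov, steering)
print("MCWF equals post-filter gain times MVDR:",
      np.allclose(h_mcwf, gain * h_mvdr))
```

Under this model the identity follows from the matrix inversion lemma and holds for any positive-definite noise covariance. The limitations described above therefore concern how the post-filter coefficients are estimated in practice, not the decomposition itself.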