Spatial directivity in audio conferencing systems can be achieved either through directional microphones or through proper combination of several omni-directional microphones (referred to as microphone array technology).
Beamforming may be used to discriminate a source position in a “noisy” environment by “weighting” or modifying the gain of the signal from each microphone to create a beam in a desired “look” direction toward the source (i.e. talker).
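The gain weighting described above can be sketched as follows. This is a minimal illustration, not a method taken from any cited reference: the function names, the `sharpness` parameter, the exponential gain rule, and the six-microphone circular layout are all assumptions made for the example.

```python
import numpy as np

def steering_weights(mic_angles_rad, look_angle_rad, sharpness=4.0):
    """Hypothetical gain rule: microphones facing the look direction get more gain."""
    # Alignment is 1 for a microphone facing the look direction, -1 for the opposite side.
    alignment = np.cos(mic_angles_rad - look_angle_rad)
    raw = np.exp(sharpness * alignment)
    return raw / raw.sum()  # normalize so the gains sum to 1

def beamform(mic_signals, weights):
    """Weighted sum across microphones: (num_mics, num_samples) -> (num_samples,)."""
    return weights @ mic_signals

# Assumed layout: six microphones evenly spaced on a circle.
mic_angles = np.arange(6) * (2 * np.pi / 6)
look_angle = mic_angles[2]  # point the beam at the talker in front of microphone 2

weights = steering_weights(mic_angles, look_angle)
rng = np.random.default_rng(0)
mic_signals = rng.normal(size=(6, 1000))  # placeholder microphone signals
beam_output = beamform(mic_signals, weights)
```

Changing `look_angle` changes only the weights, so the beam can be re-pointed without touching the microphone signals themselves.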
For full-duplex operation, acoustic echo cancellation must be performed to prevent reverberation, howling, etc. One approach is to perform acoustic echo cancellation on all of the microphone signals in parallel. However, this approach is computationally intensive, since it requires as many acoustic echo cancellers running in parallel as there are microphones in the conferencing device. Another approach is to perform acoustic echo cancellation on only one signal: the spatially filtered signal at the output of the beamformer (i.e. the output signal of the particular microphone facing the “look direction” at any given point in time).
The challenge that this second approach presents to acoustic echo cancellation is that the characteristics of the directional signal vary with the spatial area toward which the system is pointing. For example, the acoustic echo path as well as the room characteristics (background noise, etc.) may change suddenly as the system changes its look direction, for instance when switching to a different talker. As a result, the acoustic echo cancellation algorithm must re-converge to the new characteristics (for instance, a new echo path) each time the system changes its look direction. During these transitions, the acoustic echo cancellation performance of the system is degraded.
There are methods known in the prior art to combine multi-microphone directionality (beamforming) and acoustic echo cancellation. These generic structures are presented in:
1. Chapter 13 of Microphone Arrays: Signal Processing Techniques and Applications, by Michael Brandstein and Darren Ward, published by Springer Verlag, 2001.
2. W. Herbordt and W. Kellermann. Limits for Generalized Sidelobe Cancellers with Embedded Acoustic Echo Cancellation. In Proc. Int. Conference on Acoustics, Speech, and Signal Processing (ICASSP), Salt Lake City, USA, May 2001.
3. H. Buchner, W. Herbordt and W. Kellermann. An Efficient Combination of Multi-Channel Acoustic Echo Cancellation With a Beamforming Microphone Array. Conf. Rec. International Workshop on Hands-free Speech Communication (HSC 2001), Kyoto, Japan, April 2001.
The method set forth in reference [1], above, performs acoustic echo cancellation on the microphone signals (one AEC per microphone) such that the microphone signal inputs to the beamformer are clear of echo. In this structure, the AECs operate without any repercussion from the beamformer and the beamformer is undisturbed by acoustic echoes, so that both functional blocks perform as expected. However, this approach requires multi-channel acoustic echo cancellation and is therefore computationally demanding.
A computationally more efficient structure places the AEC behind the beamformer, as set forth in reference [2], above. With this method, only one acoustic echo canceller is required. However, in this case the beamformer is part of the echo path impulse response that the AEC has to model (i.e. adapt to). If the beamformer has to track multiple (or moving) sources, which is common in teleconferencing, then the AEC is challenged by a sudden change in the echo impulse response every time the beamformer switches to a new local source (i.e. talker). This may result in poor echo cancellation until the AEC has re-adapted to the new echo path.
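The re-adaptation problem can be illustrated with a small simulation. The sketch below is an assumption-laden example, not the structure of reference [2]: it models the beamformer as a fixed gain vector, so the echo path seen by the single AEC is the weighted sum of the per-microphone echo paths, and it uses a textbook NLMS adaptive filter as a stand-in for the AEC. The filter length, step size, weights, and random echo paths are all hypothetical.

```python
import numpy as np

def effective_echo_path(mic_echo_paths, weights):
    """Echo path seen after the beamformer: weighted sum of per-microphone paths."""
    return weights @ mic_echo_paths  # (num_mics,) @ (num_mics, taps) -> (taps,)

def nlms(far_end, mic, taps, mu=0.5, eps=1e-8):
    """Textbook NLMS adaptive filter; returns the residual echo at each sample."""
    w = np.zeros(taps)
    buf = np.zeros(taps)  # buf[k] holds far_end[n - k]
    err = np.zeros(len(far_end))
    for n in range(len(far_end)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        err[n] = mic[n] - w @ buf
        w += mu * err[n] * buf / (buf @ buf + eps)
    return err

def simulate_look_direction_switch(num_samples=4000, taps=16, seed=0):
    rng = np.random.default_rng(seed)
    far_end = rng.normal(size=num_samples)            # loudspeaker signal
    mic_echo_paths = 0.5 * rng.normal(size=(4, taps)) # assumed per-microphone echo paths
    weights_a = np.array([0.7, 0.1, 0.1, 0.1])        # beam toward talker A
    weights_b = np.array([0.1, 0.1, 0.7, 0.1])        # beam toward talker B
    echo_a = np.convolve(far_end, effective_echo_path(mic_echo_paths, weights_a))[:num_samples]
    echo_b = np.convolve(far_end, effective_echo_path(mic_echo_paths, weights_b))[:num_samples]
    switch = num_samples // 2
    # The beamformer output carries echo through path A, then abruptly through path B.
    beam_echo = np.where(np.arange(num_samples) < switch, echo_a, echo_b)
    return nlms(far_end, beam_echo, taps), switch

err, switch = simulate_look_direction_switch()
residual_before = np.mean(err[switch - 200:switch] ** 2)  # converged on path A
residual_after = np.mean(err[switch:switch + 50] ** 2)    # just after the switch
residual_end = np.mean(err[-200:] ** 2)                   # re-converged on path B
```

In this idealized, noise-free setting the residual echo is negligible before the switch, jumps when the look direction changes, and decays again only as the filter re-adapts.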
To overcome the problem of time variations in the echo path when acoustic echo cancellation is performed after beamforming, a compromise structure is suggested in reference [3], above. In this method, acoustic echo cancellation is performed at the outputs of N fixed beamformers (where N is less than the number of microphones in the array) covering N look directions. The signals passed to the time-varying beamformer are clear of echo, and the time-varying beamformer is therefore able to react to newly active local sources and/or interference. This structure is a compromise between the first and second structures (i.e. references [1] and [2]); however, it still requires multi-channel acoustic echo cancellation, which is not computationally efficient. One object of the present invention is to improve the performance of acoustic echo cancellation that operates on spatially filtered signals while preserving low computational cost.