Adaptive microphone array processing, such as adaptive beam-forming, is known for use with speech and audio signal capture systems. Typically such processing is employed for suppression of environmental interference or enhancement of a desired speech signal.
Acoustic echo controllers (AECs) are used in hands-free and full-duplex audio communication systems to cancel and suppress the acoustic echoes that originate from loudspeakers. For example, in a conference call, a speaker's reproduced voice at the far end (the listener's end) may be captured by microphones at that far end. It may be captured multiple times, as the reproduced voice scatters or diffracts off surfaces within the room where the listener is located. In addition to these echoes, environmental interference which would ideally be removed is often present during a conference call. For example, the sound of the PC delivering the conference call, air conditioning and the like are all sounds which would preferably not be transferred to the far end. Not only would the transfer of these sounds degrade the overall quality of the conference call, it would also use bandwidth unnecessarily.
For simultaneous suppression of both acoustic echoes from loudspeakers and environmental interference, it is necessary to combine adaptive microphone array processing, particularly adaptive beam-forming, with acoustic echo controllers.
As described below, there are a number of known techniques [1] for combining AEC with adaptive beam-forming solutions.
Using an “AEC first” technique, AEC is applied before the beam-former, so a system implementing this technique requires one AEC per microphone channel. This method has several drawbacks. Firstly, the computational complexity of the system is high where a large number of microphones are involved. Secondly, each microphone channel picks up the entire sound-field of the room, including all stationary and non-stationary noise and interference, room boundary reflections and room reverberation. This interference can slow down the adaptation of an echo canceller. Furthermore, adaptive filters with many taps are required to handle the long echo tail, which further increases the computational resources each AEC requires. Finally, conventional echo control solutions combine linear echo cancellation with non-linear residual echo suppression. Linear echo cancellation may delay the signals being processed, but it does not distort their phase. Non-linear residual echo suppression, by contrast, may destroy the linearity of the system by introducing non-linear phase delays, and thereby limit the adaptation and performance available from the adaptive beam-former.
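The “AEC first” arrangement described above can be illustrated with a minimal sketch. It is not taken from the cited techniques: it assumes a normalized LMS (NLMS) adaptive filter as the echo canceller, a fixed averaging beam-former standing in for an adaptive one, and synthetic noise-free echo paths. All function names and parameters (nlms_echo_cancel, aec_first, demo, taps, mu) are hypothetical. The point to note is that one adaptive filter is run per microphone, so the filtering cost grows with the number of microphones multiplied by the number of taps.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, taps=32, mu=0.5, eps=1e-8):
    """Remove the far-end echo from one microphone signal with an NLMS filter."""
    w = np.zeros(taps)            # adaptive estimate of the echo path
    buf = np.zeros(taps)          # most recent far-end samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        e = mic[n] - w @ buf      # residual after subtracting the echo estimate
        w += mu * e * buf / (buf @ buf + eps)
        out[n] = e
    return out

def aec_first(far_end, mics, taps=32):
    """One echo canceller per microphone channel, then combine the channels."""
    cleaned = np.stack([nlms_echo_cancel(far_end, m, taps) for m in mics])
    # Assumption: a plain average stands in for the adaptive beam-former.
    return cleaned.mean(axis=0)

def demo(num_mics=4, n=4000, taps=32, seed=0):
    """Return echo power at the output without and with per-channel AEC."""
    rng = np.random.default_rng(seed)
    far_end = rng.standard_normal(n)
    # Assumption: each microphone sees a different short synthetic echo path.
    paths = 0.5 * rng.standard_normal((num_mics, 8))
    mics = np.stack([np.convolve(far_end, h)[:n] for h in paths])
    out = aec_first(far_end, mics, taps)
    echo_in = np.mean(mics.mean(axis=0)[-500:] ** 2)
    echo_out = np.mean(out[-500:] ** 2)
    return echo_in, echo_out

if __name__ == "__main__":
    echo_in, echo_out = demo()
    print(f"echo power without AEC: {echo_in:.3e}, with AEC: {echo_out:.3e}")
```

In this idealized setting every canceller converges quickly because the microphones contain only echo. With real noise, reverberation and long echo tails, each of the per-channel filters would need many more taps and would adapt more slowly, which is the drawback noted above.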
Alternatively, a “beam-former first” approach can be used, in which beam-forming [2,3] is applied before AEC. A major drawback of this method is that the beam-former adapts faster than the AEC: the AEC sees a rapidly time-varying impulse response at the beam-former output and cannot adapt to it in time. This results in a degradation of AEC performance.
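The tracking problem of the “beam-former first” arrangement can be illustrated with a minimal sketch under simplifying assumptions: two microphones with different synthetic echo paths, a beam-former whose weights switch abruptly from one microphone to the other (an extreme stand-in for fast adaptation), and a single NLMS echo canceller at the beam-former output. The names beamformer_first, nlms_step and demo are hypothetical and do not correspond to any cited method.

```python
import numpy as np

def nlms_step(w, buf, d, mu=0.5, eps=1e-8):
    """One NLMS update; w is modified in place, the residual is returned."""
    e = d - w @ buf
    w += mu * e * buf / (buf @ buf + eps)
    return e

def beamformer_first(far_end, mics, bf_weights, taps=16):
    """Apply time-varying beam-former weights, then a single echo canceller.

    mics has shape (M, N); bf_weights has shape (N, M), one weight row per sample.
    """
    w = np.zeros(taps)
    buf = np.zeros(taps)
    err = np.zeros(len(far_end))
    for n in range(len(far_end)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        beam = bf_weights[n] @ mics[:, n]   # beam-former output at sample n
        err[n] = nlms_step(w, buf, beam)    # residual echo after the AEC
    return err

def demo(n=4000, seed=1):
    """Return residual echo power just before and just after a weight change."""
    rng = np.random.default_rng(seed)
    far_end = rng.standard_normal(n)
    paths = rng.standard_normal((2, 8))     # assumed synthetic echo paths
    mics = np.stack([np.convolve(far_end, h)[:n] for h in paths])
    bf_weights = np.zeros((n, 2))
    bf_weights[: n // 2] = [1.0, 0.0]       # beam initially selects microphone 0
    bf_weights[n // 2 :] = [0.0, 1.0]       # then switches to microphone 1
    err = beamformer_first(far_end, mics, bf_weights)
    before = np.mean(err[n // 2 - 200 : n // 2] ** 2)
    after = np.mean(err[n // 2 : n // 2 + 20] ** 2)
    return before, after

if __name__ == "__main__":
    before, after = demo()
    print(f"residual before switch: {before:.3e}, after switch: {after:.3e}")
```

The effective echo path seen by the canceller is the weighted sum of the individual microphone paths, so any change in the beam-former weights changes the path the canceller must model. In the sketch the residual echo rises sharply at the moment the weights change and falls again only as the canceller re-converges; a continuously adapting beam-former would keep perturbing the path in this way.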
Finally, there are known “joint optimization of AEC and adaptive beam-forming” methods, in which the adaptation of the AEC and that of the beam-former are carried out in combination. However, these methods are not compatible with existing AEC and beam-forming solutions. New algorithms must therefore be designed, tested and tuned for each of the user scenarios envisaged.