In human/machine interfaces for audio signal acquisition, especially under hands-free conditions, adaptive beamforming microphone arrays are widely used to enhance a desired signal while suppressing interference and noise. In full-duplex communication systems, the desired signal is corrupted not only by interference and noise but also by acoustic echoes originating from the loudspeakers. For suppressing acoustic echoes, acoustic echo cancellers (AECs) using adaptive filters may be the optimum choice, since they exploit the reference information provided by the loudspeaker signals.
To suppress interference and acoustic echoes simultaneously, it is thus desirable to combine acoustic echo cancellation with adaptive beamforming in the acoustic human/machine interface. To achieve optimum performance, synergies between the AECs and the beamformer should be exploited while the computational complexity is kept moderate. When designing such a joint acoustic echo cancellation and beamforming system, it is necessary to consider in particular the time variance of the acoustic echo path, the background noise level, and the reverberation time of the acoustic environment. Various strategies for combining acoustic echo cancellation with beamforming have been studied in the published literature, ranging from cascades of AECs and beamformers to integrated solutions. These combinations address aspects such as maximization of the echo and noise suppression for slowly time-varying echo paths and high echo-to-interference ratios (EIRs), handling of strongly time-varying echo paths and low EIRs, or minimization of the computational complexity.
Full-duplex hands-free acoustic human/machine interfaces often require a combination of acoustic echo cancellation and speech enhancement to suppress acoustic echoes, local interference, and noise. However, efficient solutions for situations with high-level background noise, time-varying echo paths, long echo reverberation times, and frequent double talk remain a challenging research topic. To optimally exploit positive synergies between acoustic echo cancellation and speech enhancement, adaptive beamforming and acoustic echo cancellation may be jointly optimized. The adaptive beamforming system by itself is already quite complex for most consumer-oriented applications, so a system that jointly optimizes the adaptive beamformer and the acoustic echo canceller could be too complex.
An ‘AEC first’ system or a ‘beamforming first’ system has lower complexity than a system that jointly optimizes the adaptive beamformer and the acoustic echo canceller. In the ‘AEC first’ system, positive synergies for the adaptive beamforming can be exploited after convergence of the AECs: the acoustic echoes are efficiently suppressed by the AECs, and the adaptive beamformer does not depend on the echo signals. However, one AEC is necessary for each microphone channel, so that, compared with an AEC for a single microphone, the complexity of at least the filtering and the filter update is multiplied by the number of microphones. Moreover, in the presence of strong interference and noise, the adaptation of the AECs must be slowed down or even stopped in order to avoid instabilities of the adaptive filters. Alternatively, the AEC can be placed behind the adaptive beamformer in the ‘beamforming first’ system, which reduces the complexity to that of an AEC for a single microphone. However, positive synergies cannot be exploited for the adaptive beamformer, since the beamformer sees not only interference but also acoustic echoes.
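The complexity argument above can be made concrete with a rough operation count. The sketch below is only illustrative: the function name `aec_macs_per_sample` is hypothetical, and the cost model (about one multiply-accumulate per tap for filtering plus one per tap for the coefficient update) is an assumption typical of an NLMS-type filter, not a figure taken from the text. The beamformer cost is left out because it is incurred in both cascades.

```python
def aec_macs_per_sample(num_mics, filter_taps, aec_first):
    """Rough multiply-accumulate (MAC) count per input sample for the AEC stage.

    Assumed cost model (hypothetical): each adaptive filter spends about
    `filter_taps` MACs on filtering and `filter_taps` MACs on its update.
    """
    macs_per_filter = 2 * filter_taps
    # 'AEC first': one AEC per microphone channel.
    # 'Beamforming first': a single AEC behind the beamformer output.
    num_filters = num_mics if aec_first else 1
    return num_filters * macs_per_filter


# Example with assumed values: 4 microphones, 1024-tap echo path model.
aec_first_cost = aec_macs_per_sample(4, 1024, aec_first=True)
beamforming_first_cost = aec_macs_per_sample(4, 1024, aec_first=False)
print(aec_first_cost, beamforming_first_cost)
```

Under this cost model the ‘AEC first’ cascade costs as many times more than the ‘beamforming first’ cascade as there are microphones, which is the trade-off the paragraph describes.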
Beamforming is a technique that extracts a desired signal contaminated by interference on the basis of directivity, i.e., spatial signal selectivity. This extraction is performed by processing the signals obtained by multiple sensors, such as microphones, located at different positions in space. The principle of beamforming has been known for a long time. Because of the vast amount of signal processing required, most research and development effort was initially focused on geological investigations and sonar, which can afford a high cost. With the advent of LSI technology, the burden of the required signal processing has become relatively small. As a result, a variety of research projects in which acoustic beamforming is applied to consumer-oriented applications, such as cellular phone speech enhancement, have been carried out. Applications of beamforming include microphone arrays for speech enhancement. The goal of speech enhancement is to remove undesirable signals such as noise and reverberation. Among the research areas in the field of speech enhancement are teleconferencing, hands-free telephones, hearing aids, speech recognition, intelligibility improvement, and acoustic measurement.
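Spatial selectivity can be illustrated with the simplest fixed beamformer, a delay-and-sum array. This is a minimal sketch and not the adaptive beamformer discussed in this document. The function names, the four-microphone scene, the integer sample delays, and the tone periods are all assumptions, chosen so that the interferer cancels exactly. A real array would only attenuate it.

```python
import math


def delay_and_sum(channels, steering_delays):
    """Time-align the channels for one look direction and average them.

    channels: list of equal-length sample lists, one per microphone.
    steering_delays: integer arrival delay (in samples) of the desired
    signal at each microphone, assumed known in this sketch.
    """
    usable = len(channels[0]) - max(steering_delays)
    return [
        sum(ch[n + d] for ch, d in zip(channels, steering_delays)) / len(channels)
        for n in range(usable)
    ]


def tone(period):
    """Return a function n -> sin(2*pi*n/period), defined for any integer n."""
    return lambda n: math.sin(2.0 * math.pi * n / period)


def power(samples):
    """Mean squared value of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


# Hypothetical scene: two tones arriving from different directions, so each
# source reaches the four microphones with a different set of delays.
desired, interferer = tone(20), tone(8)
desired_delays = [0, 1, 2, 3]
interferer_delays = [0, 3, 6, 9]
num_samples = 400
channels = [
    [desired(n - dd) + interferer(n - di) for n in range(num_samples)]
    for dd, di in zip(desired_delays, interferer_delays)
]

# Steer toward the desired source: its copies add coherently, while the
# interferer copies end up a quarter period apart and sum to zero.
output = delay_and_sum(channels, desired_delays)
residual = [y - desired(n) for n, y in enumerate(output)]
print(power(residual))
```

Steering toward the desired source compensates its per-microphone delays, so its copies add in phase, while a source from another direction stays misaligned and is averaged down.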
The signal played back by the loudspeaker is fed back to the microphones, where it appears as acoustic echoes. Under the assumption that the amplifiers and the transducers are linear, a linear model is commonly used for the echo paths between the loudspeaker signal and the microphone signals. To cancel the acoustic echoes in a microphone channel, an adaptive filter is placed in parallel to the echo path between the loudspeaker signal and the microphone signal, with the loudspeaker signal as a reference. The adaptive filter forms a replica of the echo path such that the output signal of the adaptive filter is a replica of the acoustic echoes. Subtracting the output signal of the adaptive filter from the microphone signal thus suppresses the acoustic echoes. Acoustic echo cancellation is then a system identification problem, where the echo paths are usually identified by adaptive linear filtering. The design of the adaptation algorithm of the adaptive filter requires consideration of the nature of the echo paths and of the echo signals.
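The system identification view described above can be sketched for a single microphone as follows. The text does not name an adaptation algorithm, so the normalized least-mean-squares (NLMS) update is an assumed choice. The function name, the four-tap echo path, the white-noise loudspeaker signal, and the absence of local speech and noise are likewise illustrative assumptions.

```python
import random


def nlms_echo_canceller(reference, microphone, num_taps=8, step_size=0.5, eps=1e-8):
    """Cancel the echo of `reference` contained in `microphone`.

    NLMS is an assumed algorithm choice for this sketch.
    Returns (error_signal, weights); the error signal is the echo-reduced
    microphone signal, and the weights are the echo path estimate.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # latest reference samples, newest first
    errors = []
    for x, d in zip(reference, microphone):
        history = [x] + history[:-1]
        # Echo replica: adaptive filter output for the loudspeaker signal.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Subtract the replica from the microphone signal.
        e = d - echo_estimate
        # Normalized update, scaled by the energy of the reference samples.
        norm = sum(h * h for h in history) + eps
        weights = [w + step_size * e * h / norm for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights


# Hypothetical echo path and loudspeaker signal, for illustration only.
rng = random.Random(0)
echo_path = [0.6, -0.3, 0.2, 0.1]
reference = [rng.gauss(0.0, 1.0) for _ in range(3000)]
microphone = [
    sum(h * reference[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(reference))
]

errors, weights = nlms_echo_canceller(reference, microphone)
print([round(w, 3) for w in weights[:4]])
```

In this noise-free setting the leading filter coefficients approach the assumed echo path and the residual echo decays toward zero. With local speech or strong noise present, the adaptation would have to be slowed down or stopped.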