Echo cancellation and beamforming can be used in a wide range of signal processing applications. Both are active research topics of interest in mobile communications, and also in other models of a transmission channel in the widest sense.
To date, beamforming using several microphones has been utilized mainly in military applications (for example in radar for target finding, or in phased arrays, also known as beam steering), in intelligence applications, and in professional teleconference or videoconference applications. The aforesaid applications are usually characterized by high system costs in conjunction with relatively low manufacturing volumes.
Currently, the first low-cost consumer products for mobile communication that apply microphone arrays and advanced audio processing are entering the market. These products use signal processing hardware or software, for example a digital signal processor system having memory, or a dedicated integrated circuit, for example one using very large scale integration.
Basically, algorithms for Acoustic Echo Cancellation (AEC) systems model a dynamically changing acoustic echo path from the loudspeaker to the microphone using an adaptive filter. There are many different algorithms and methods proposed to implement acoustic echo cancellation systems (see S. Haykin, “Adaptive Filter Theory”, Prentice Hall, N.J., 1996), but such systems all have the common objective to estimate the transfer function of the acoustic echo path.
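As an illustration of how an adaptive filter can estimate the echo path, the following is a minimal sketch using the normalized least-mean-squares (NLMS) algorithm, one of the standard adaptive filters. The function names, the step size, the filter length, and the short synthetic echo path are assumptions chosen for illustration only; the sketch assumes no near-end speech and no noise.

```python
import random


def simulate_echo(far_end, echo_path):
    """Convolve the loudspeaker signal with a (hypothetical) echo path."""
    return [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far_end))
    ]


def nlms_echo_canceller(far_end, mic, num_taps=8, mu=0.5, eps=1e-8):
    """Estimate the echo path with NLMS; return (filter taps, residual signal)."""
    w = [0.0] * num_taps        # current estimate of the echo path
    x_buf = [0.0] * num_taps    # most recent loudspeaker samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, x_buf))
        e = d - echo_estimate   # AEC output (residual echo)
        norm = sum(xi * xi for xi in x_buf) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x_buf)]
        residual.append(e)
    return w, residual


rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, -0.3, 0.15, 0.05]          # assumed, illustrative values
mic = simulate_echo(far_end, echo_path)
w_est, residual = nlms_echo_canceller(far_end, mic)
```

In this idealized setting the estimated taps approach the assumed echo path and the residual decays towards zero; in practice, near-end speech, noise, and echo path changes limit how closely the model can follow the true path.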
Communication systems are typically used by people in a time-varying acoustic environment. Especially with portable communication devices, where the user is moving and handling the device, the user's movements in the vicinity of the active acoustic transducers (microphone and loudspeaker) can introduce radical changes in the echo path. Since the acoustic echo path can change in time, the Acoustic Echo Cancellation (AEC) filters are also designed to be adaptive in time.
Adaptive AEC filters, by definition, require some adaptation time to estimate the transfer functions from the available history of the input data. In addition to the theoretical maximum adaptation speed of the adaptive filter, other time-varying changes in the operating conditions, such as changes in near-end speech activity or in the noise environment, can slow the AEC algorithm in reaching a good model of the acoustic echo path. Therefore, AEC algorithms cannot recover instantaneously from sudden changes in the echo path, which causes temporary echo leakage to the AEC output. This residual echo at the AEC output can cause additional problems such as audible echo, uplink signal distortion, interference with background noise reduction algorithms, or even howling.
For full-duplex communication, the performance of the adaptive echo cancellation is heavily dependent on the accuracy of the linear echo path model. Good linear echo cancellation enables simultaneous two-way communication without the need for significant residual echo suppression. Nonlinear signal suppression is the common approach in acoustic echo control systems to achieve a sufficiently high echo level reduction. There are also standards that set limits on the echo cancellation performance, e.g. ITU-T G.165.
One of the key steps towards full-duplex communication is to improve the performance of the AEC system. AEC becomes easier if the microphone can always be placed close to the desired source (e.g. the user's mouth) and the loudspeaker is placed far away from the microphone, with both transducers in fixed locations. In the best case, all time-variant changes (such as moving people) would be minimized so that, after the initial adaptation at the beginning of the call, there would be no need for further echo path model adaptation. Unfortunately, all of these requirements are in conflict with the product requirements of typical mobile devices and with their typical usage.
Therefore, more robust algorithms and methods are needed to improve the desired signal pickup and AEC processing in a very challenging environment with disturbances and unpredictable acoustic changes. One potential method for improving the desired signal capture from a distance is to replace the microphone with a directional microphone or an array of microphones. Integration of AEC algorithms and multi-microphone beamforming algorithms is not a trivial task, since both of these algorithms can be time varying (adaptive), which can lead to a situation where the AEC operation becomes dependent on beamforming filter operation and vice versa.
A straightforward low-complexity integration of a single-channel AEC with a multi-microphone system leads to an approach where the beamforming filter is first applied to the multi-channel microphone signal, and the AEC filter is then applied to the single-channel uplink signal at the output of the beamforming filter.
However, if the beamforming filter supports beam steering, dynamic steering will also change the echo path from the loudspeaker signal to the beamforming filter output signal and in this way disturb the AEC adaptation. Although the integration of multi-microphone beamforming and AEC technologies has been studied in the literature, dynamic beam steering has so far not been studied in detail. The aforesaid phenomenon is also present in the applicant's current multi-microphone beamforming car hands-free product HF-6W.
Several prior art technologies have been proposed that integrate AEC with time-invariant beamforming without any technical problems. However, the practical situation changes significantly when the AEC is used with a time-varying beamforming front-end.
The problem arises from the fact that when the multi-microphone beamforming (MMBF) filter is changed (steered from one direction to another), the transfer function of the echo path also changes. A sudden echo path change implies that the AEC filter will have an imprecise model of the echo path, which increases the echo signal leakage through the AEC filter.
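The dependence of the echo path on the beamforming filter can be sketched numerically. Assuming a filter-and-sum beamformer, the echo path seen by an AEC placed after the beamformer is the sum, over microphones, of each loudspeaker-to-microphone impulse response convolved with that microphone's beamforming filter. The impulse responses and filter coefficients below are made-up illustrative values, and the function names are hypothetical.

```python
def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def effective_echo_path(beam_filters, echo_paths):
    """Echo path from the loudspeaker to the beamformer output."""
    length = max(len(w) + len(h) - 1 for w, h in zip(beam_filters, echo_paths))
    total = [0.0] * length
    for w, h in zip(beam_filters, echo_paths):
        for k, value in enumerate(convolve(w, h)):
            total[k] += value
    return total


# Assumed loudspeaker-to-microphone impulse responses for two microphones.
echo_paths = [[0.5, 0.2], [0.0, 0.4, 0.1]]

# Two assumed steering states of the same two-microphone beamformer.
look_a = [[0.5], [0.5]]          # simple averaging of both microphones
look_b = [[0.5], [0.0, -0.5]]    # delay-and-subtract combination

path_a = effective_echo_path(look_a, echo_paths)
path_b = effective_echo_path(look_b, echo_paths)
```

Although the physical echo paths are unchanged, `path_a` and `path_b` differ, so an AEC filter that had converged for the first steering state would hold an inaccurate model immediately after the switch to the second.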
The AEC adaptation will then correct the model and in this way try to reduce the residual echo level. For example, MMBF beam steering can cause an instantaneous change of 180 degrees in the look direction of the microphone array, from looking away from the loudspeaker to looking towards the loudspeaker. This type of radical change can cause significant changes in the residual echo level as well as significant changes in the dominating room reflections.
For example, consider a teleconference application where multiple persons use the same device in the same acoustic space. In this situation, beamforming technologies can provide a significant improvement in the uplink signal-to-noise ratio (SNR) when the beam is steered towards the active speaker. This type of automatic active source detection and beam steering technology could easily introduce situations where the changes in beam directivity are instantaneous, unpredictable, and frequent in time. If the AEC processing is applied to the MMBF output signal, the AEC algorithm would also be forced to track instantaneous, unpredictable, and frequent echo path changes.
The degree of echo path variability caused by dynamic beam steering in joint MMBF-AEC operation can be influenced by appropriate industrial design (large microphone-to-loudspeaker distance), appropriate microphone directivity control (low microphone sensitivity towards the loudspeaker), and possibly by AEC adaptation methods that are controlled by the MMBF steering. However, the fundamental conflict between the requirements of a dynamically steered MMBF and the adaptation of AEC algorithms cannot be sufficiently avoided without a proper arrangement of the AEC processing. In order to avoid this problem completely, the AEC processing should be applied to the microphone signal path before the time-varying MMBF processing manipulates the echo path. The only prior art approach known to the inventor applies a dedicated AEC filter to every microphone signal before the (time-varying) beamforming stage. In this case the number of AEC filters, the computational complexity, and the required runtime memory increase linearly with the number of microphones, which can easily limit the maximum number of microphones the product can support.
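The linear growth in cost can be made concrete with a rough count. The sketch below assumes, purely for illustration, an NLMS-type filter costing about two multiply-accumulate operations per tap per sample (one for filtering, one for the coefficient update), one coefficient set per microphone, and a single loudspeaker reference buffer shared by all filters. The function name and the tap count of 512 are hypothetical.

```python
def per_microphone_aec_cost(num_mics, filter_taps):
    """Rough per-sample cost of running one AEC filter per microphone.

    Returns (multiply-accumulates per sample, memory words), under the
    assumptions stated in the accompanying text.
    """
    macs_per_sample = num_mics * 2 * filter_taps
    # One coefficient vector per microphone plus one shared reference buffer.
    memory_words = num_mics * filter_taps + filter_taps
    return macs_per_sample, memory_words


for mics in (1, 2, 4, 8):
    print(mics, per_microphone_aec_cost(mics, filter_taps=512))
```

Under these assumptions, doubling the number of microphones doubles the arithmetic load, which is the scaling that limits the microphone count in the per-microphone arrangement.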
In mobile applications both the end-user and the device can move and therefore a fixed speaker direction cannot be assumed. Therefore, it is necessary to estimate the direction of arrival (DOA) of the desired signal and utilize this spatial information to steer the beam pattern towards the desired source to improve the uplink SNR by attenuating the undesired signal components (noise) and undesired room reflections from the desired signal (speech). Hence, time-varying beam steering and acoustic echo cancellation are both needed for high performance mobile telephony applications utilizing MMBF technologies.
In a general case it would be very challenging to model the dynamic characteristics of the echo path changes. When each user has a different way of using the device, every acoustic space has unique reverberation characteristics, acoustic transducers have statistical variations, and all these factors tend to change in time, the echo path characteristics have an infinite number of different combinations and factors affecting the echo path statistics. Even if there were significant statistical redundancy in practical echo path models, the Applicant is unaware of any feasible transfer function prediction method that could be applied for real-time prediction of dynamic echo path changes.
The academic literature recognizes a few basic configurations for integrating time-varying beamforming with acoustic echo cancellation (see Kellermann W. L.: “Acoustic Echo Cancellation for Beamforming Microphone Arrays”, in Microphone Arrays, Eds. Brandstein M., Ward D., Springer-Verlag, New York, 2001; W. Herbordt, W. Kellermann, and S. Nakamura: “Joint Optimization of LCMV Beamforming and Acoustic Echo Cancellation”, Proc. EURASIP European Signal Processing Conference (EUSIPCO), Vienna, Austria, September 2004; W. Herbordt, W. Kellermann, “Limits for Generalized Sidelobe Cancellers with Embedded Acoustic Echo Cancellation”, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Salt Lake City, May 2001). Kellermann has (in the second citation above) reviewed four alternative AEC-beamformer configurations without considering the beam steering functionality.
The beamforming filter can be partitioned into a time-invariant front-end and a time-varying post-filter (see EP 1184676; Kajala M. and Hämäläinen M., “Filter-and-sum Beamformer with Adjustable Filter Characteristics”, In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, July 2001), so that the time-variant filter follows the time-invariant processing. This makes it possible to process the intermediate signals between these two stages.
This Polynomial Beamforming Filter (PBF) is a favourable structure for applying AEC processing to the intermediate signals. The processed intermediate signals, which are derived from the microphone signals and contain only low-level residual echo components, are passed to the PBF post-filter. The PBF post-filter can then be applied to the intermediate signals to steer the beam towards any direction, independently of the preceding AEC processing, in accordance with the prior art related to the PBF filter structure.
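The benefit of placing the AEC between the two stages can be sketched as follows. The sketch assumes, for illustration only, that the post-filter combines the intermediate signals as a polynomial in a single scalar steering parameter, and that the AEC has already converged so that its echo estimates are exact. The function names, the signal values, and the steering values are all hypothetical.

```python
def pbf_post_filter(intermediate, steer):
    """Time-varying stage: weight intermediate signal p by steer**p and sum."""
    num_samples = len(intermediate[0])
    return [
        sum((steer ** p) * signal[n] for p, signal in enumerate(intermediate))
        for n in range(num_samples)
    ]


def cancel_echo(intermediate, echo_estimates):
    """AEC stage applied per intermediate signal, before any steering."""
    return [
        [x - e for x, e in zip(signal, estimate)]
        for signal, estimate in zip(intermediate, echo_estimates)
    ]


# Assumed speech and echo components at the output of the fixed front-end.
speech = [[0.5, -0.25, 0.125], [0.25, 0.5, -0.5], [0.0, 0.125, 0.25]]
echo = [[0.75, 0.5, -0.25], [-0.5, 0.25, 0.5], [0.25, -0.125, 0.0]]
intermediate = [[s + e for s, e in zip(sp, ec)] for sp, ec in zip(speech, echo)]

# Idealized AEC: the echo estimates equal the true echo components.
cleaned = cancel_echo(intermediate, echo)

# Steering changes only the post-filter; the AEC stage is untouched.
outputs = {steer: pbf_post_filter(cleaned, steer) for steer in (-1.0, 0.0, 1.0)}
```

Because the echo paths up to the intermediate signals pass only through time-invariant processing, the same echo estimates remain valid for every steering value, and each steered output contains only the speech contribution.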
The proposed invention has several possible embodiments. Most of these variants are obvious from the previous IPR on PBF filtering, but a multi-channel (MIMO) configuration has also been considered. The beamforming system view is given in FIG. 6. These figures should be relatively easy to interpret for an expert in the field.
In all of the cited examples, time-varying beamforming refers to adaptive beamformer operation where the adaptation follows statistical signal variations, as in Adaptive Interference Cancellers (AIC) and Generalized Sidelobe Cancellers (GSC). Consequently, AEC integration with a dynamically steered beamformer front-end has not been considered there to the extent of this invention. This is partly due to the fact that other researchers have not considered the use of the PBF (polynomial beamforming filter) structure. Also, some applications such as hearing aids may keep the beam directivity fixed, because the user can easily adjust the directivity pattern by turning his or her head towards the sound. Moreover, in this invention the formulation of the MMBF-AEC problem is more general and more complex than in the existing academic literature.
An alternative approach to the problem of steering-independent AEC integration is to assign a separate AEC filter to each microphone input before MMBF processing.