Beamforming is a technique for providing spatial directivity in communication systems, such as audio conferencing systems. Beamforming can be implemented with directional microphones, or with combinations of omni-directional microphones forming a microphone array. Beamforming can be used to discriminate a source position in a noisy environment by forming a weighted combination of the signals from each microphone. This creates a desired “look” direction aimed at the source, or talker, position. Beamformers are logical elements that correspond to the combination of one or more microphone inputs. Multiple beamformers are typically provided to give a number of look directions or sectors. The beamformers can be fixed, or they can provide adaptive beamforming to minimize undesired near-end noise or interference signals in real time. In adaptive beamforming, the impulse responses or gains of the microphones are dynamically adjusted to optimize source location, signal-to-noise ratio, or other desired audio characteristics in changing acoustic environments. Adaptive beamforming systems are complex and computationally intensive, and they suffer from general robustness issues, such as desired signal cancellation. It is often simpler to provide sector-based beamforming in consumer and enterprise devices, such as conference telephones. Such units are designed to provide a sufficient number of static beams (e.g. twelve equally spaced beams) to accommodate a number of call participants arrayed around the base unit. The conference unit can switch between static beamformers using a state machine, typically based on talker localization.
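By way of illustration only, the weighted combination of microphone signals and the selection among fixed beams can be sketched as follows. The sketch assumes a two-microphone array, integer sample delays, and a crude output-power criterion standing in for talker localization; the function names `delay_and_sum`, `mean_power` and `select_beam`, and all numeric values, are hypothetical and are not taken from any referenced system.

```python
import math

def delay_and_sum(mic_signals, delays, weights):
    """Fixed beamformer: delay each microphone signal by an integer number
    of samples, scale it by its weight, and sum the results."""
    n = len(mic_signals[0])
    out = [0.0] * n
    for signal, delay, weight in zip(mic_signals, delays, weights):
        for t in range(delay, n):
            out[t] += weight * signal[t - delay]
    return out

def mean_power(x):
    """Average squared amplitude of a signal."""
    return sum(v * v for v in x) / len(x)

def select_beam(mic_signals, beams):
    """Assumed stand-in for talker localization: pick the fixed beam whose
    output carries the most power."""
    powers = [mean_power(delay_and_sum(mic_signals, d, w)) for d, w in beams]
    best = max(range(len(beams)), key=powers.__getitem__)
    return best, powers

# Toy scene (assumption): a tone with a 12-sample period reaches mic1
# three samples later than mic0.
n = 240
source = [math.sin(2 * math.pi * t / 12) for t in range(n)]
mic0 = source
mic1 = [0.0] * 3 + source[:-3]

# Two static beams, each given as (delays, weights). Beam 0 delays mic0 so
# that it lines up with mic1; beam 1 applies the opposite steering.
beams = [((3, 0), (0.5, 0.5)), ((0, 3), (0.5, 0.5))]
best, powers = select_beam([mic0, mic1], beams)
print("selected beam:", best)
```

In this toy scene the beam whose delays align the two microphone signals adds them coherently, while the oppositely steered beam combines them half a period apart, so the power criterion selects the aligned beam. A practical device would use a dedicated localization algorithm and a state machine rather than raw output power.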
In full-duplex audio communication systems, acoustic echo cancellation (AEC) is typically applied to prevent reverberation, howling and other undesirable effects. For example, in speaker phones, a portion of the audio signal output by the loudspeaker and reflected in the reverberant environment is received by the microphones. Unless it is compensated for, this phenomenon is distracting to participants in telephone calls and is considered a nuisance. Adaptive AEC techniques are well known in the art.
The efficient integration of beamforming and AEC continues to be a challenge. One approach has been to perform AEC first on all the input microphone signals, in parallel, prior to beamforming. This approach has a prohibitive computational cost, because it requires as many acoustic echo cancellers, running in parallel, as there are sensors in the device.
In another common approach, the beamforming is performed first, and a single acoustic echo canceller is placed at the output of the beamformer. Due to differing physical characteristics, such as furniture placement, room design, and location of participants, each beamformer will have different echo characteristics. When the look direction or beamforming coefficients change, the AEC algorithm must adapt to the new echo characteristics. This approach presents a challenge to the AEC operation, because the directional signal has characteristics that vary according to the spatial area to which the system is looking. For example, the acoustic echo path and room characteristics (background noise, etc.) may change suddenly as the system changes its look direction to accommodate a new talker. Without special care, the AEC algorithm must converge to very different cancellation coefficients each time the system changes its look direction. This can result in poor echo cancellation until the AEC algorithm converges, and, accordingly, poor transitions between beamformers, particularly if the AEC does not quickly converge to the required cancellation coefficients.
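The reconvergence problem described above can be illustrated with a small simulation. The sketch below assumes a normalized least-mean-squares (NLMS) adaptive filter as the echo canceller, a white far-end signal, no near-end speech or noise, and two short, made-up echo paths standing in for two look directions. NLMS is chosen here only as a common example of an adaptive AEC technique; the names `nlms_step`, `run`, `path_a` and `path_b` are hypothetical.

```python
import random

def nlms_step(w, x_buf, d, mu=0.5, eps=1e-8):
    """One NLMS update: estimate the echo, form the residual, adapt w in place."""
    y = sum(wi * xi for wi, xi in zip(w, x_buf))   # echo estimate
    e = d - y                                      # residual echo
    norm = sum(xi * xi for xi in x_buf) + eps      # input energy (regularized)
    for i in range(len(w)):
        w[i] += mu * e * x_buf[i] / norm
    return e

def run(echo_paths, samples_per_path, taps=8, seed=0):
    """Run one canceller while the echo path switches abruptly, as it would
    when the system changes its look direction."""
    rng = random.Random(seed)
    w = [0.0] * taps
    x_buf = [0.0] * taps
    errors = []
    for path in echo_paths:
        for _ in range(samples_per_path):
            x = rng.uniform(-1.0, 1.0)             # far-end (loudspeaker) sample
            x_buf = [x] + x_buf[:-1]
            d = sum(h * xi for h, xi in zip(path, x_buf))  # echo at beam output
            errors.append(nlms_step(w, x_buf, d))
    return errors

def mean_square(values):
    return sum(v * v for v in values) / len(values)

# Hypothetical echo paths for two look directions (assumed values).
path_a = [0.5, 0.3, -0.2, 0.1, 0.0, 0.0, 0.0, 0.0]
path_b = [-0.4, 0.2, 0.3, -0.1, 0.05, 0.0, 0.0, 0.0]

errors = run([path_a, path_b], samples_per_path=2000)
before_switch = mean_square(errors[1980:2000])  # converged on path_a
after_switch = mean_square(errors[2000:2020])   # just after the beam change
settled = mean_square(errors[3980:4000])        # reconverged on path_b
print(before_switch, after_switch, settled)
```

Under these assumptions the residual echo is negligible just before the switch, rises sharply in the samples immediately after it, and becomes negligible again only once the filter has reconverged to the new path. Real rooms involve far longer echo paths, near-end speech and noise, so reconvergence there is considerably slower than in this toy case.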
For the general case of time-varying, or adaptive, beamforming, several structures have recently been proposed to combine the optimization processes of beamforming and echo cancellation into a single optimization process. The proposed structures include those of: W. Kellermann, “Acoustic Echo Cancellation for Beamforming Microphone Arrays,” in Microphone Arrays, M. Brandstein, D. Ward (eds.), Springer, Berlin, May, 2001, pp. 281-306; W. Herbordt, S. Nakamura, W. Kellermann, “Joint Optimization of LCMV Beamforming and Acoustic Echo Cancellation for Automatic Speech Recognition,” Proc. of IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2005, March 2005; K.-D. Kammeyer, M. Kallinger, A. Mertins, “New Aspects of Combining Echo Cancellers with Beamformers,” Proc. of IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2005, March 2005; and US Patent Publication No. 2002/0015500, entitled Method And Device For Acoustic Echo Cancellation Combined With Adaptive Beamforming, to Belt et al. These methods tend to be complex and are not optimal for sector-based beamformers, where a known number of fixed beamformers are used to cover a desired spatial area in concert with an apparatus used to switch from one beamformer to another. The described methods are also designed for slowly varying beamformers, such as adaptive noise or interference cancellation beamformers, and do not efficiently handle the sudden and possibly drastic changes that occur in a switched-beamforming structure.
To take advantage of the somewhat simpler framework of sector-based, switched beamforming, US Patent Publication No. 2004/0125942, entitled Method Of Acoustic Echo Cancellation In Full-Duplex Hands-Free Audio Conferencing Systems With Spatial Directivity, to Beaucoup et al., the contents of which are incorporated herein by reference, proposes storing the information pertaining to echo cancellation for each “sector” in memory. According to this structure, the information pertaining to echo cancellation for each fixed beamformer is stored in memory as a workspace, and retrieved from memory the next time the talker localization algorithm re-selects the sector. This structure works well and provides smooth transitions from beamformer to beamformer in stationary, or essentially time invariant, acoustic environments. It does suffer from problems, however, in non-stationary (time-varying) acoustic environments. To be precise, if the acoustic environment, or the echo path for a particular beamformer, changes significantly between two successive selections of that beamformer by the localization algorithm, then the information stored in the workspace for the particular beamformer no longer provides good echo cancellation and the performance of the system will degrade. Another drawback of this approach is that, in order to be entirely trained in terms of echo cancellation, the device needs to operate in all possible beamformer positions and perform AEC in each position.
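The workspace idea can be sketched as follows. This is a minimal illustration and not the implementation of the cited publication: it assumes that the stored information is simply the coefficient vector of an NLMS filter, that the far-end signal is white, and that each sector has a short, made-up and time-invariant echo path. The class `SectorAEC`, the function `simulate` and the `paths` values are hypothetical.

```python
import random

class SectorAEC:
    """One adaptive echo canceller whose coefficients are saved and restored
    per sector (hypothetical sketch of the workspace scheme)."""

    def __init__(self, num_sectors, taps):
        self.workspaces = [[0.0] * taps for _ in range(num_sectors)]
        self.active = 0
        self.w = list(self.workspaces[0])
        self.x_buf = [0.0] * taps

    def switch_sector(self, sector):
        self.workspaces[self.active] = list(self.w)  # save current workspace
        self.w = list(self.workspaces[sector])       # restore the stored one
        self.active = sector

    def process(self, x, mic, mu=0.5, eps=1e-8):
        """One NLMS step on far-end sample x and beam-output sample mic."""
        self.x_buf = [x] + self.x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(self.w, self.x_buf))
        e = mic - y
        norm = sum(xi * xi for xi in self.x_buf) + eps
        self.w = [wi + mu * e * xi / norm for wi, xi in zip(self.w, self.x_buf)]
        return e

def simulate(schedule, paths, samples=2000, taps=8, seed=1):
    """Visit sectors in order; return the mean squared residual echo over the
    first 20 samples of each visit."""
    rng = random.Random(seed)
    aec = SectorAEC(len(paths), taps)
    hist = [0.0] * taps                  # far-end history used to form the echo
    first_errors = []
    for sector in schedule:
        aec.switch_sector(sector)
        errs = []
        for _ in range(samples):
            x = rng.uniform(-1.0, 1.0)
            hist = [x] + hist[:-1]
            mic = sum(h * xi for h, xi in zip(paths[sector], hist))
            errs.append(aec.process(x, mic))
        first_errors.append(sum(e * e for e in errs[:20]) / 20)
    return first_errors

# Assumed, time-invariant echo paths for two sectors.
paths = [
    [0.5, 0.3, -0.2, 0.1, 0.0, 0.0, 0.0, 0.0],
    [-0.4, 0.2, 0.3, -0.1, 0.05, 0.0, 0.0, 0.0],
]
first_errors = simulate([0, 1, 0], paths)
print(first_errors)
```

In this stationary toy case the first visit to each sector starts from an untrained workspace and shows a large initial residual, whereas the return to sector 0 starts from the restored coefficients and cancels the echo immediately. The sketch also makes the stated limitations visible: a restored workspace helps only if the sector's echo path has not changed since it was saved, and a sector's workspace remains untrained until that sector has been visited at least once.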
In order to minimize the amount of information that has to be stored in the workspaces, a method was proposed in F. Beaucoup, “Parallel Beamformer Design Under Response Equalization Constraints,” Proc. of IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2004, Montreal, Canada, May 2004, to optimally design the fixed beamformers to ensure that they all have the same response to a certain equalization signal. For example, the equalization signal can be chosen to be as close as possible to the loudspeaker-coupling signal. This coupling equalization approach, however, can only provide limited improvement in practice, because the loudspeaker-coupling signal can only be known a priori with limited accuracy. One reason for this is that it is only possible to determine the direct-path coupling signal, i.e. the coupling signal resulting from the direct feedback between the loudspeaker and the microphones, at the design stage. The indirect-path coupling signal, resulting from reflections from various objects in the acoustic environment, depends on the acoustic environment in which the device is operated and cannot be known in advance. Even if only the direct-path signal is targeted, which is reasonable since it accounts for most of the energy of the echo, other factors come into play that limit the accuracy of a priori knowledge. These factors include loudspeaker-induced structural vibrations, acoustic leakage, and component and manufacturing variability. Therefore, in practice, this design method can only be used to minimize the amount of information that needs to be stored in each workspace, and does not solve the problem of optimizing the integration of beamforming and AEC.
Therefore, it is desirable to provide a communication system and method that can provide rapid adaptive coupling equalization in beamforming-based communication systems, particularly sector-based beamforming systems, in order to provide smooth transitions for acoustic echo cancellation when the look direction of the communication system changes and when the acoustic environment varies with time.