A device for bi-directional audio-based communication typically includes both a loudspeaker and a microphone. The loudspeaker is used to play back audio signals received from a remote (“far-end”) source, while the microphone is used to capture audio signals from a local (“near-end”) source. In the case of a telephone call, for example, the near- and far-end sources may be people engaged in a conversation, and the audio signals may contain speech. An acoustic echo occurs when the far-end signal emitted by the loudspeaker is captured by the microphone, after undergoing reflections in the local environment.
An acoustic echo canceller (AEC) may be used to remove acoustic echo from an audio signal captured by a microphone, in order to improve communication. The AEC typically filters the microphone signal by determining an estimate of the acoustic echo and subtracting that estimate from the microphone signal to produce an approximation of the true near-end signal. The estimate is obtained by applying a transformation to the far-end signal emitted from the loudspeaker. The transformation is implemented using an adaptive algorithm such as least mean squares (LMS), normalized least mean squares (NLMS), or their variants, which are known to persons of ordinary skill in the art.
The adaptive transformation relies on a feedback loop, which continuously adjusts a set of parameters that are used to calculate the estimated echo from the far-end signal. The adaptation functions more effectively in some situations than in others. For example, adaptation is most effective when the far-end signal is active (e.g., because a person on the far end is talking), and the near-end signal is inactive (e.g., because nobody on the near end is talking). In this situation, referred to as “receive single talk,” the microphone signal will include only the acoustic echo. Therefore, the adaptive parameters can simply be adjusted until they yield an estimated echo that matches the actual echo present in the microphone signal. In this way, the feedback loop will cause the estimated echo to converge on the actual echo.
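The feedback loop described above may be illustrated with a minimal sketch of a normalized least mean squares canceller operating during receive single talk. The sketch is offered only as an example under stated assumptions: the function name, the tap count, the step size, and the three-tap echo path are hypothetical values chosen for illustration and are not taken from any particular implementation.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step_size=0.5, eps=1e-8):
    """Return (residual, weights) for a sample-by-sample NLMS canceller.

    All parameter names and defaults are illustrative assumptions.
    """
    weights = [0.0] * num_taps   # adaptive parameters of the transformation
    history = [0.0] * num_taps   # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        # Estimated echo: the transformation applied to the far-end signal.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Microphone signal minus estimated echo approximates the near end.
        error = d - echo_estimate
        # Feedback: adjust the parameters in proportion to the error,
        # normalized by the far-end energy in the filter window.
        energy = sum(h * h for h in history) + eps
        gain = step_size * error / energy
        weights = [w + gain * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Receive single talk: the microphone signal contains only the echo.
rng = random.Random(0)
echo_path = [0.6, -0.3, 0.1]  # hypothetical echo path (assumption)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = []
for n in range(len(far_end)):
    echo = sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    mic.append(echo)

residual, weights = nlms_echo_canceller(far_end, mic)
```

Because the microphone signal in this example contains no near-end component, the leading entries of `weights` approach the assumed echo path and the residual shrinks toward zero as adaptation proceeds.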
Adaptation is less effective in the situation referred to as “double talk,” when the far-end signal and the near-end signal are both simultaneously active. In the presence of double talk, the microphone signal will include both the near-end signal and the acoustic echo. The AEC may be unable to adequately distinguish between the different components of the microphone signal. Accordingly, if the feedback loop continues during double talk, the estimated echo may diverge from the actual echo, and the AEC may no longer cancel the acoustic echo satisfactorily. In order to prevent such divergence, AECs typically rely on double-talk detectors, which may be used to decrease the rate of adaptation or stop it altogether during periods of double talk.
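One way a double-talk detector might gate adaptation is sketched below. The description above does not specify a detection method; the sketch assumes a Geigel-style comparison, in which double talk is declared when the microphone sample is large relative to the recent far-end peak. The function names, the threshold, and the slow-down factor are hypothetical.

```python
def geigel_double_talk(mic_sample, recent_far_end, threshold=0.5):
    """Declare double talk when the microphone level exceeds a fraction
    of the recent far-end peak (Geigel-style test; threshold is assumed)."""
    peak = max((abs(x) for x in recent_far_end), default=0.0)
    return abs(mic_sample) > threshold * peak


def adaptation_step(base_step, double_talk, slow_factor=0.0):
    """Scale the step size during double talk.

    slow_factor=0.0 stops adaptation altogether; a value between 0 and 1
    merely decreases the rate of adaptation.
    """
    return base_step * slow_factor if double_talk else base_step


recent_far_end = [0.5, -0.4, 0.2]  # illustrative far-end samples
loud_mic = geigel_double_talk(0.9, recent_far_end)    # near end also active
quiet_mic = geigel_double_talk(0.1, recent_far_end)   # echo only
```

The returned step size would take the place of a fixed step size in the adaptive update, so that the parameters are held steady while the near-end signal is present.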
In some situations, it may be useful to increase the rate of adaptation beyond the normal level. For example, when the communication device changes position relative to objects or people in the local environment, the reflections that produce the acoustic echo may change as well. This is referred to as an echo path change. When an echo path change occurs, the estimated echo produced by the AEC may no longer provide a good approximation of the actual acoustic echo. The AEC's adaptive parameters will eventually be updated by the feedback loop, but until then the quality of the AEC output will be diminished. Therefore, AECs typically rely on echo path change detectors (EPCDs), which may be used to temporarily increase the rate of adaptation.
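A temporary increase in the rate of adaptation may be sketched as follows. The description above does not say how an EPCD reaches its decision; this sketch assumes a simple energy-ratio test, in which an echo path change is flagged when the residual left after cancellation remains large relative to the microphone signal. The function names, the ratio threshold, the boost factor, and the step-size cap are all hypothetical.

```python
def echo_path_changed(mic_frame, residual_frame, ratio_threshold=0.5, eps=1e-12):
    """Flag a possible echo path change when cancellation performs poorly.

    Compares residual energy to microphone energy over one frame.
    The threshold is an illustrative assumption.
    """
    mic_energy = sum(s * s for s in mic_frame)
    residual_energy = sum(s * s for s in residual_frame)
    return residual_energy / (mic_energy + eps) > ratio_threshold


def boosted_step(base_step, changed, boost=2.0, max_step=1.0):
    """Raise the step size while a change is flagged, capped at max_step."""
    return min(base_step * boost, max_step) if changed else base_step


mic_frame = [1.0, -1.0, 0.5]
good_cancellation = echo_path_changed(mic_frame, [0.01, -0.02, 0.01])
poor_cancellation = echo_path_changed(mic_frame, [0.9, -0.8, 0.4])
```

Note that a test of this kind cannot by itself distinguish an echo path change from double talk, since near-end speech also raises the residual energy.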
As described above, AECs are typically configured to slow adaptation during double talk and to speed up adaptation during echo path changes. These adaptation strategies are mutually incompatible, so a conflict may arise when double talk and an echo path change occur simultaneously. Such a conflict can be resolved by giving precedence to double talk over echo path changes, so that adaptation is slowed or halted when both occur at the same time. An AEC may be configured to implement such precedence by selectively disregarding the output of an EPCD based on the output of a double-talk detector. It may be preferable, however, to configure the EPCD so that echo path changes are not detected at all during double talk. This way, an echo path change and double talk will never be detected together, and the proper adaptation strategy can be easily determined in any given circumstance.
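The two approaches to resolving the conflict may be compared in a short sketch. The enumeration and function names are hypothetical and serve only to illustrate the decision logic; the detectors themselves are represented by boolean flags.

```python
from enum import Enum
from itertools import product


class Adaptation(Enum):
    SLOW = "slow"      # slowed or halted
    NORMAL = "normal"
    FAST = "fast"      # temporarily increased


def select_by_precedence(double_talk, epc_raw):
    """First approach: the AEC disregards the EPCD output during double talk."""
    if double_talk:
        return Adaptation.SLOW
    return Adaptation.FAST if epc_raw else Adaptation.NORMAL


def gated_epcd(epc_raw, double_talk):
    """Second approach: the EPCD itself reports no change during double talk."""
    return epc_raw and not double_talk


def select_with_gated_epcd(double_talk, epc_gated):
    """With a gated EPCD the two flags are never both set, so the order
    in which they are checked does not matter."""
    if epc_gated:
        return Adaptation.FAST
    if double_talk:
        return Adaptation.SLOW
    return Adaptation.NORMAL


# Both approaches select the same strategy for every combination of inputs.
results = {
    (dt, epc): select_with_gated_epcd(dt, gated_epcd(epc, dt))
    for dt, epc in product([False, True], repeat=2)
}
```

In the second approach, the logic that selects the adaptation strategy no longer needs to encode a precedence rule, because the conflicting combination of flags cannot occur.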