Echoes arise at various points in a voice communication system. Without proper control, they can significantly degrade conversation quality. In telecommunications systems, echoes can be caused by an impedance mismatch where two-wire local customer loops are coupled to four-wire long distance trunks. If the impedances of the two systems are matched, a communication signal passes between them without causing an echo. For various reasons, it is difficult to match the impedances exactly. When there is a mismatch, part of the signal is reflected back to the far-end talker as an echo. The situation can be further complicated by the presence of two-wire toll switches, which allow intermediate four-two-four wire conversions internal to the network. In telephone connections using satellite links with round-trip delays on the order of 600 ms, line echoes can become particularly disruptive.
In hands-free communication systems, such as videoconferencing systems, mobile telephones and computer multimedia applications, echoes are caused by the microphone picking up sound from the loudspeaker and feeding it back to the far-end talker. Often the loudspeakers and microphones are placed at a distance from the participants, making the echo sound loud and clear. Furthermore, the sound reverberation time in a typical office-sized room is several hundred milliseconds, and somewhat less in a vehicle. At a sampling rate of 8 kHz this corresponds to several thousand samples (for example, 300 ms spans 2,400 samples), which makes the implementation of an acoustic echo canceller highly complex.
In a typical hands-free environment, a near-end space contains a loudspeaker and a microphone. A far-end talker produces a speech signal which is broadcast over the loudspeaker, with an echo path created by sound from the loudspeaker received by the microphone. The echo path can be represented by an unknown transfer function that varies with changes in the space, such as movement of the loudspeaker, the microphone or people within the space, and even changes in ambient temperature. An echo is transmitted back to the far-end as a return signal y(n). Furthermore, an additional component v(n), which may comprise background noise or double-talk (such as a person in the space who is also talking), will also be detected by the microphone and form part of the return signal y(n).
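The signal model described above can be written compactly. The following is only a sketch under common assumptions that the text does not state: the far-end signal is denoted x(n), and the unknown, time-varying echo path is approximated by a finite impulse response with N coefficients h_k(n). Both symbols and the finite length N are introduced here for illustration.

```latex
y(n) = \sum_{k=0}^{N-1} h_k(n)\, x(n-k) + v(n)
```

The sum is the echo of the far-end signal, and v(n) is the near-end component (background noise or double-talk) that the microphone picks up along with it.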
Echo cancellers have been developed to suppress these echoes in communication systems. A typical echo canceller includes an adaptive filter and a subtractor. The incoming signal is passed to the adaptive filter, which attempts to model the echo path and estimate the echo. The estimate is subtracted from the return signal to produce an error signal. The error signal is then fed back to the adaptive filter, which adjusts its filter coefficients in order to minimize the error signal. The filter coefficients converge toward values that optimize the estimate so that it cancels the echo signal. Echo cancellers are widely deployed in telephone networks and are essential for any hands-free speech device.
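The adaptive filter and subtractor loop described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular canceller: it assumes a normalized least mean squares (NLMS) update, a short 4-tap echo path, a white-noise far-end signal and no near-end signal. The names `nlms_echo_canceller`, `simulate_echo`, `h_true`, `mu` and `eps` are hypothetical.

```python
import random


def simulate_echo(x, h):
    """Pass the far-end signal x through an FIR echo path h (illustrative)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def nlms_echo_canceller(x, y, num_taps=4, mu=0.5, eps=1e-8):
    """Adaptive filter plus subtractor, using an assumed NLMS update.

    x: far-end (incoming) samples, y: return-signal samples.
    Returns the error signal and the final filter coefficients.
    """
    w = [0.0] * num_taps      # adaptive filter coefficients
    buf = [0.0] * num_taps    # recent far-end samples, newest first
    errors = []
    for xn, yn in zip(x, y):
        buf = [xn] + buf[:-1]
        echo_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        err = yn - echo_estimate                      # subtractor output
        step = mu * err / (sum(b * b for b in buf) + eps)
        w = [wi + step * bi for wi, bi in zip(w, buf)]  # feed error back
        errors.append(err)
    return errors, w


random.seed(0)
h_true = [0.5, -0.3, 0.2, 0.1]                        # assumed echo path
x = [random.gauss(0.0, 1.0) for _ in range(4000)]     # far-end signal
y = simulate_echo(x, h_true)                          # echo only, no near-end
e, w = nlms_echo_canceller(x, y, num_taps=len(h_true))
print("largest coefficient error:",
      max(abs(wi - hi) for wi, hi in zip(w, h_true)))
```

With no near-end signal present, the coefficients settle on the simulated echo path and the error signal decays toward zero. A realistic acoustic canceller would need far more taps than this toy example.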
In order for the adaptive filter to correctly model the echo path, the return signal y(n) of the echo path must originate solely from its input signal. During double-talk, speech at the near-end acts as uncorrelated noise and can cause the filter coefficients to diverge. Coefficient drift is usually not catastrophic, although a brief echo may be heard until convergence is established again. In closed-loop paths (which typically include acoustic echo paths), coefficient drift may lead to an unstable system, which causes howling and makes convergence difficult. To alleviate this problem, double-talk detectors are commonly used to disable adaptation while double-talk occurs. Unfortunately, double-talk detectors can fail to indicate the presence of double-talk for a whole syllable after double-talk begins. During this time the coefficients may drift and lead to howling as mentioned above. Furthermore, double-talk becomes increasingly difficult to detect as the acoustic echo becomes large in comparison to the near-end signal.
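To make the idea of a double-talk detector concrete, the sketch below uses the classic Geigel detector. This is a swapped-in example: the text does not say which detector is meant. It declares double-talk when the return-signal magnitude exceeds a fraction of the recent far-end peak. The function name, the 32-sample window, the 0.5 threshold and the synthetic sine signals are all assumptions made for illustration.

```python
import math


def geigel_double_talk(x, y, window=32, threshold=0.5):
    """Flag sample n as double-talk when |y(n)| exceeds threshold times the
    largest far-end magnitude over the last `window` samples."""
    flags = []
    for n, yn in enumerate(y):
        recent = x[max(0, n - window + 1): n + 1]
        peak = max(abs(s) for s in recent)
        flags.append(abs(yn) > threshold * peak)
    return flags


# Synthetic signals (assumed): far-end tone, echo attenuated to 0.3 and
# delayed by two samples, near-end talker active from sample 100 onward.
x = [math.sin(0.3 * n) for n in range(200)]
echo = [0.3 * x[n - 2] if n >= 2 else 0.0 for n in range(200)]
near = [2.0 * math.sin(0.7 * n + 1.0) if n >= 100 else 0.0 for n in range(200)]
y = [e + v for e, v in zip(echo, near)]

flags = geigel_double_talk(x, y)
print("flagged during single-talk:", any(flags[:100]))
print("flagged during double-talk:", any(flags[100:]))
```

The comparison only works when the echo is well below the far-end level. As the echo grows relative to the near-end signal, the threshold test loses its margin, which is the detection difficulty noted above.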
One adaptive echo canceller designed to overcome the double-talk problem is proposed by Ochiai et al. in “Echo canceller with two echo path models”, IEEE Transactions on Communications, 25(6): 589-595, 1977. This document describes an echo canceller with a fixed (non-adaptive) foreground filter and an adaptive background filter. Each of the filters generates an estimate of the echo signal. The filter coefficients of the foreground filter are replaced with those of the background filter when the background filter provides a better estimate of the echo signal than the foreground filter. During uncorrelated double-talk, the foreground filter is therefore relatively immune to coefficient drift in the background filter. There are, however, drawbacks to this approach. If the filter coefficients of the background filter diverge and must subsequently re-converge, there may be a relatively long delay before the background filter again provides a better error signal than the foreground filter. As a result, convergence of the foreground filter may be significantly delayed. This is particularly serious when double-talk is followed immediately by echo path variations, because the echo canceller fails to track any variation until the background filter has re-converged. The result is a significant degradation in conversation quality.
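The two-filter arrangement can be illustrated with a simplified sketch. It is not the transfer logic of Ochiai et al.: the background filter here adapts with an assumed NLMS update, and its coefficients are copied to the foreground filter whenever its error energy over a block is less than `margin` times the foreground error energy. The block length, the margin, the 4-tap echo path and all names are hypothetical choices.

```python
import random


def fir(w, buf):
    """FIR filter output for coefficients w and sample buffer buf."""
    return sum(wi * bi for wi, bi in zip(w, buf))


def two_path_canceller(x, y, num_taps=4, mu=0.5, eps=1e-8,
                       block=128, margin=0.5):
    """Fixed foreground filter plus adaptive background filter (simplified)."""
    fg = [0.0] * num_taps     # foreground: only changed by copying
    bg = [0.0] * num_taps     # background: adapts on every sample
    buf = [0.0] * num_taps
    out, fg_energy, bg_energy = [], 0.0, 0.0
    for n, (xn, yn) in enumerate(zip(x, y), start=1):
        buf = [xn] + buf[:-1]
        e_fg = yn - fir(fg, buf)          # output sent to the far end
        e_bg = yn - fir(bg, buf)
        out.append(e_fg)
        step = mu * e_bg / (sum(b * b for b in buf) + eps)
        bg = [wi + step * bi for wi, bi in zip(bg, buf)]
        fg_energy += e_fg * e_fg
        bg_energy += e_bg * e_bg
        if n % block == 0:
            # Assumed criterion: copy only if background is clearly better.
            if bg_energy < margin * fg_energy:
                fg = list(bg)
            fg_energy = bg_energy = 0.0
    return out, fg, bg


random.seed(1)
h_true = [0.5, -0.3, 0.2, 0.1]                         # assumed echo path
x = [random.gauss(0.0, 1.0) for _ in range(4096)]
echo = [sum(h_true[k] * x[n - k] for k in range(4) if n - k >= 0)
        for n in range(4096)]
# Near-end talker (double-talk) is active only in the second half.
near = [random.gauss(0.0, 1.0) if n >= 2048 else 0.0 for n in range(4096)]
y = [e + v for e, v in zip(echo, near)]

out, fg, bg = two_path_canceller(x, y)
fg_dev = max(abs(a - b) for a, b in zip(fg, h_true))
bg_dev = max(abs(a - b) for a, b in zip(bg, h_true))
print("foreground deviation:", fg_dev)
print("background deviation:", bg_dev)
```

In this simulation the background filter drifts once the near-end talker starts, while the foreground filter keeps the coefficients it received during single-talk. The same gating means the foreground filter cannot follow a changed echo path until the background filter has re-converged and clearly outperforms it, which is the delay described above.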
In such a two-filter echo canceller, the foreground filter is non-adaptive while the background filter uses a least-squares-type technique such as least squares, least mean squares (LMS), or normalized least mean squares (NLMS). Double-talk causes filters adapted with these kinds of techniques to diverge, which results in a delay before the filter coefficients track echo path variations again. Since echo paths change continually, echoes can become apparent during this delay.
To improve on this prior approach, U.S. Pat. No. 6,947,549 discloses an echo canceller having two parallel filters and a controller. The controller chooses the better of the two filters for the final echo cancellation. The filter coefficients are exchanged between the two filters constantly so that both filters retain their performance at all times, even if double-talk and echo path variations occur very close together in time. One filter is strongly robust and will not diverge during double-talk, while the other filter is weakly robust and will converge rapidly during echo path variation. However, when the slower canceller is perfectly converged and near-end noise is introduced at the microphone, the perfectly converged canceller would diverge, which frustrates the goal of the echo canceller in a high noise environment.
Other approaches to echo cancellation or noise reduction are disclosed in U.S. Pat. Nos. 6,792,106; 6,608,897; 6,526,141; and 5,664,011.