1. Field of the Invention
The present invention relates to echo cancellation in communication systems, and more particularly to echo cancellers.
2. Background Information
Echoes arise at various points in a voice communication system. Without proper control they can cause significant degradation in conversation quality.
In telecommunications systems, echoes are caused by an impedance mismatch between two-wire local customer loops coupled to four-wire long distance trunks. If the impedances of the two systems are matched, a communication signal passes between them without causing an echo. For various reasons it is difficult to match the impedances exactly. When there is a mismatch, part of the signal is reflected back to the far-end talker as an echo. The situation can be further complicated by the presence of two-wire toll switches, allowing intermediate four-two-four wire conversions internal to the network. In telephone connections using satellite links with round-trip delays on the order of 600 ms, line echoes can become particularly disruptive.
FIG. 1 shows a block diagram example of a telecommunications system in which echoes occur. The system comprises a conventional telephone set 1, connected to a transformer 2 which is fed by an analogue-to-digital (A/D) converter 3. The transformer 2 feeds a conventional digital-to-analogue (D/A) converter 4. The A/D 3 receives a far-end speech signal x(n) from a far-end talker. As a result of the imperfect coupling at the four-to-two-wire junction of transformer 2, the return signal y(n) generated by D/A 4 may contain an echo. Furthermore, the return signal may contain an additional component comprising any ambient noise picked up by telephone 1, and any double-talk which the near-end user of telephone 1 may produce.
In hands-free communication systems, such as videoconferencing systems, mobile telephones and computer multimedia applications, echoes are caused by the microphone picking up sound from the loudspeaker and feeding it back to the far-end talker. Often the loudspeakers and microphones are placed at a distance from the participants, making the echo loud and clear. Furthermore, the sound reverberation time in a typical office-sized room is several hundred milliseconds. This corresponds to several thousand samples at a sampling rate of 8 kHz, which creates high complexity in the implementation of an acoustic echo canceller.
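As a rough check of the filter length implied by a given reverberation time, the number of samples (and hence filter taps) spanned is the duration multiplied by the sampling rate. The function name and the 300 ms figure below are illustrative assumptions, not values taken from the text.

```python
def taps_needed(reverb_ms: float, sample_rate_hz: int) -> int:
    """Number of FIR taps needed to span reverb_ms at the given sampling rate."""
    return int(round(reverb_ms * sample_rate_hz / 1000.0))


# Hypothetical example: 300 ms of reverberation sampled at 8 kHz.
print(taps_needed(300, 8000))  # 2400
```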
FIG. 2 shows a block diagram example of a hands-free environment. It includes a (near-end) receiving room 5, which contains a conventional loudspeaker 6 and a conventional microphone 7. A far-end talker produces a speech signal x(n) which is broadcast over loudspeaker 6. An echo path 8 is created by sound from loudspeaker 6 being received by microphone 7. The echo path 8 is represented by an unknown transfer function h(n) that varies with changes in the room 5 environment, such as movement of the loudspeaker 6, the microphone 7, other objects or people; the opening and closing of doors; and changes in room 5 temperature. An echo is transmitted back to the far-end as a return signal y(n). Furthermore, an additional component v(n) comprising background noise or double-talk (a person in room 5 who is also talking) will also be detected by the microphone 7 and form part of the return signal y(n).
Echo cancellers have been developed to suppress these echoes in communication systems. An echo canceller includes an adaptive filter and a subtractor. The incoming signal is passed to the adaptive filter, which attempts to model the echo path and estimate the echo. The estimate is subtracted from the return signal to produce an error signal. The error signal is then fed back to the adaptive filter, which adjusts its filter coefficients in order to minimize the error signal. The filter coefficients converge toward values that optimize the estimate signal in order to cancel the echo signal. Echo cancellers are deployed in every telephone network, and are essential for any hands-free speech device.
Referring to FIGS. 1 and 2, a known acoustic echo canceller 10 comprises an adaptive filter and update module 11, and a conventional subtractor 12. The filter ĥ(n) models the echo path and produces an estimate ŷ(n) of the echo signal y(n). This estimate is subtracted from the return signal y(n) by subtractor 12 to generate the error signal e(n). The error signal e(n) is returned to the far-end of the telecommunications system. In a closed loop system the error signal e(n) is also fed back to the update module, which tries to minimise the error signal e(n) by adapting the coefficients of the filter ĥ(n).
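The arrangement of filter, subtractor and coefficient update described above can be sketched as follows. The text does not specify the update rule used by module 11; a normalized LMS update is assumed here purely for illustration, and the step size `mu`, the tap count, the regularizer `eps` and the simulated echo path `h_true` are all hypothetical.

```python
import random


def simulate_echo(x, h):
    """Convolve far-end signal x with a fixed, hypothetical echo path h."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def nlms_echo_canceller(x, y, num_taps=8, mu=0.5, eps=1e-8):
    """Return (e, h_hat): the error signal and the final coefficient estimate."""
    h_hat = [0.0] * num_taps   # adaptive filter coefficients
    buf = [0.0] * num_taps     # most recent far-end samples, newest first
    e = []
    for x_n, y_n in zip(x, y):
        buf = [x_n] + buf[:-1]
        y_hat = sum(h * b for h, b in zip(h_hat, buf))  # echo estimate
        e_n = y_n - y_hat                               # subtractor output
        norm = sum(b * b for b in buf) + eps            # input power (NLMS assumption)
        h_hat = [h + mu * e_n * b / norm for h, b in zip(h_hat, buf)]
        e.append(e_n)
    return e, h_hat


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(4000)]  # stand-in for far-end speech
h_true = [0.5, -0.3, 0.2, 0.1]                     # assumed echo path
y = simulate_echo(x, h_true)                       # return signal, echo only
e, h_hat = nlms_echo_canceller(x, y)
```

With no near-end speech or noise in `y`, the residual `e` in this sketch shrinks as `h_hat` approaches `h_true`; adding an uncorrelated component to `y` is a simple way to observe the coefficient drift discussed in the text.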
In order for the adaptive filter to correctly model the echo path, the output signal y(n) of the echo path must originate solely from its input signal x(n). During double-talk, speech at the near end acts as uncorrelated noise, causing the filter coefficients to diverge. In open-loop paths, coefficient drift is usually not catastrophic, although a brief echo may be heard until convergence is established again. In closed-loop paths (which typically include acoustic echo paths), coefficient drift may lead to an unstable system, which causes howling and makes convergence difficult. To alleviate this problem, double-talk detectors are commonly used to disable adaptation while double-talk is occurring. Unfortunately, double-talk detectors fail to indicate the presence of double-talk for a whole syllable after double-talk begins. During this time the coefficients may drift and lead to howling as mentioned above. Furthermore, double-talk becomes increasingly difficult to detect as an acoustic echo becomes large in comparison to the near-end signal.
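The text does not name a particular double-talk detector. As one well-known example, a Geigel-style detector declares double-talk when the return sample is larger than a fixed fraction of the recent far-end peak. The sketch below assumes a threshold of 0.5 and a short window of far-end samples; both are illustrative choices.

```python
def geigel_double_talk(x_recent, y_n, threshold=0.5):
    """Return True when y_n is too large to be explained by echo alone.

    x_recent: recent far-end samples (window length is an assumption)
    threshold: assumed bound on echo path gain
    """
    return abs(y_n) > threshold * max(abs(v) for v in x_recent)


x_recent = [0.8, -0.6, 0.4, 0.1]
print(geigel_double_talk(x_recent, 0.2))  # False: consistent with echo only
print(geigel_double_talk(x_recent, 0.7))  # True: adaptation would be frozen
```

A detector of this kind compares magnitudes only, so a loud echo relative to the near-end speech leaves little margin for the decision.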
An adaptive echo canceller arranged for overcoming the double-talk problem is proposed by Ochiai et al. in "Echo canceller with two echo path models", IEEE Transactions on Communications, 25(6): 589-595, 1977. This document describes an echo canceller with a fixed (non-adaptive) foreground filter and an adaptive background filter. Each of the filters generates an estimate of the echo signal. The filter coefficients of the foreground filter are replaced with those of the background filter when the background filter provides a better estimate of the echo signal than the foreground filter. A similar system is disclosed in U.S. Pat. No. 5,664,011 (Crochiere et al), where the foreground filter coefficients are replaced by the sum of the foreground and background filter coefficients.
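A minimal sketch of the foreground/background arrangement is given below. It is not the published algorithm: the criterion for "a better estimate" is assumed to be lower error energy over a fixed block, the background filter is assumed to use normalized LMS, and the block length, step size and simulated echo path are hypothetical.

```python
import random


def fir_output(h, buf):
    """Output of an FIR filter h for the sample buffer buf (newest first)."""
    return sum(c * b for c, b in zip(h, buf))


def two_path_canceller(x, y, num_taps=4, mu=0.5, block=64, eps=1e-8):
    """Return (out, fg, copies) for a two-filter echo canceller sketch."""
    fg = [0.0] * num_taps   # foreground filter: changed only by copying
    bg = [0.0] * num_taps   # background filter: adapts on every sample
    buf = [0.0] * num_taps
    fg_energy = bg_energy = 0.0
    out, copies = [], 0
    for n, (x_n, y_n) in enumerate(zip(x, y), start=1):
        buf = [x_n] + buf[:-1]
        e_fg = y_n - fir_output(fg, buf)
        e_bg = y_n - fir_output(bg, buf)
        out.append(e_fg)    # only the foreground error goes to the far end
        fg_energy += e_fg * e_fg
        bg_energy += e_bg * e_bg
        norm = sum(b * b for b in buf) + eps
        bg = [c + mu * e_bg * b / norm for c, b in zip(bg, buf)]
        if n % block == 0:
            # Assumed transfer rule: copy when the background error energy
            # over the last block is lower than the foreground error energy.
            if bg_energy < fg_energy:
                fg = list(bg)
                copies += 1
            fg_energy = bg_energy = 0.0
    return out, fg, copies


rng = random.Random(1)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.5, -0.3, 0.2, 0.1]  # assumed echo path
echo = [sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far_end))]
out, fg, copies = two_path_canceller(far_end, echo)
```

Because the foreground coefficients change only when the block comparison favours the background filter, a background filter that has diverged is not copied in this sketch; the cost is that the foreground filter cannot improve until the background filter has recovered.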
During uncorrelated double-talking, the foreground filter is relatively immune from coefficient drift in the background filter. There are, however, drawbacks to this approach. In the event the filter coefficients of the background filter diverge and are subsequently re-converged, there may be a relatively long delay before the background filter works back to providing a better error signal than the foreground filter. As a result, the convergence time for the foreground filter may be significantly delayed. This is particularly serious when double-talk is followed immediately by echo path variations because the echo canceller fails to track any variation until the background filter is re-converged. This causes a significant degradation of speech conversation.
For an echo canceller with two filters, the foreground filter is non-adaptive while the background filter uses a least-squares-type technique such as least squares, least mean squares, or normalized least mean squares. When double-talk causes divergence in a filter adapted with these kinds of techniques, there is a time delay before the filter coefficients return to tracking echo path variations. Since echo paths continually change due to opening doors, moving persons, changing temperatures, etc., echoes can become apparent during this delay.