Speech typically results in reflected waves. When the reflected wave arrives a very short time after a direct sound, it is perceived as a spectral distortion or reverberation. However, when the reflection arrives a few tens of milliseconds (ms) after the direct sound (i.e., a relatively long period of time), it is heard as a distinct echo. Such echoes may be annoying, and under extreme conditions can completely disrupt a conversation.
Line echoes (i.e., electrical echoes) typically occur in telecommunications networks due to impedance mismatches at hybrid transformers that couple two-wire local customer loops to four-wire long-distance trunks. Ideally, the hybrid transformers pass the far-side signal at the four-wire receive port through to the two-wire transmit port without allowing leakage into the four-wire transmit port. However, this typically requires knowledge of the impedance seen at the two-wire ports, which in practice varies widely and can only be estimated. As a result, the leaking signal returns to the far-side talker as an echo. The situation can be further complicated by the presence of two-wire toll switches, allowing intermediate four-two-four wire conversions internal to the network. In telephone connections using satellite links with round-trip delays on the order of 600 ms, line echoes can become particularly disruptive.
Acoustic echoes, on the other hand, typically occur in telecommunications networks due to acoustic coupling between, for instance, a loudspeaker and a microphone (e.g., in a speakerphone). During a teleconference, where two or more parties are connected by a full-duplex link, an acoustic reflection of the far-side talker through the near-side conference room is returned to the far-side talker as an echo. Acoustic echo cancellation tends to be more difficult than line echo cancellation since the duration of the acoustic echo is usually several times longer (100–400 ms) than typical electrical line echoes (20 ms). In addition, the acoustic echo may change rapidly at any time due to opening doors, moving persons, changing temperatures, etc., within the conference room. In other words, environmental factors may tend to exacerbate the acoustic echo heard through such devices, making such echoes more problematic to offset than their line echo counterparts.
Echo suppressors have been developed to control line echoes in telecommunications networks. Unfortunately, echo suppressors are generally ineffective during “double-talk” when talkers at both ends are talking simultaneously. During double-talk, the four-wire transmit port carries both the near-side signal and the far-side echo signal. Furthermore, echo suppressors tend to produce speech clipping, especially during long delays caused by satellite links.
Echo cancellers have been developed to overcome the shortcomings of echo suppressors. Typically, single-path echo cancellers include an adaptive filter and a subtracter. In operation, an incoming signal, for example in a conventional speakerphone, is received from a far-side talker and is heard through a speaker by a near-side talker. Unfortunately, the incoming signal is also received through the near-side microphone, which is typically positioned close to the near-side speaker. The incoming signal heard back through the near-side microphone results in an acoustic echo, which is then heard by the far-side talker. To combat this echo, the incoming signal is also applied to the adaptive filter when it first enters the echo canceller; the adaptive filter generates a replica signal of the incoming signal in an attempt to model the echo signal. To accomplish this, the replica signal and the intended outgoing signal, which includes the echo signal, are applied to the subtracter. The subtracter subtracts the replica signal from the outgoing signal in an effort to eliminate or “cancel” the echo signal.
The resulting signal, after the cancellation, is called an error signal, since it may be analyzed to determine how much of the echo signal remains after cancellation. The error signal is fed back to the adaptive filter, which adjusts its internal filter coefficients in order to maximize cancellation of the echo signal and minimize the error signal. In this manner, the filter coefficients converge (hence, an “adaptive” filter) toward values that optimize the replica signal in order to cancel, at least as much as possible, the echo signal.
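The single-path arrangement described above (adaptive filter, subtracter, and error feedback) can be illustrated with a minimal sketch. The text does not name an adaptation rule, so the normalized least-mean-squares (NLMS) update used here is an assumption, as are the function name `nlms_echo_canceller`, the step size, the tap count, and the three-tap echo path chosen for the demonstration.

```python
import random

def nlms_echo_canceller(far, mic, taps=8, mu=0.5, eps=1e-8):
    """Single-path canceller: adaptive FIR filter plus subtracter.

    far: incoming (far-side) samples.
    mic: intended outgoing samples, which include the echo.
    Returns (error_signal, final_coefficients).
    NLMS is an assumed adaptation rule; the text does not specify one.
    """
    w = [0.0] * taps   # adaptive filter coefficients
    x = [0.0] * taps   # most recent far-side samples, newest first
    err = []
    for f, d in zip(far, mic):
        x = [f] + x[:-1]
        replica = sum(wi * xi for wi, xi in zip(w, x))  # replica of the echo
        e = d - replica                                 # subtracter output (error signal)
        norm = sum(xi * xi for xi in x) + eps
        # Error signal fed back to adjust the coefficients.
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        err.append(e)
    return err, w

# Hypothetical echo path and far-side signal, chosen only for illustration.
random.seed(0)
echo_path = [0.5, -0.3, 0.1]
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far))]

err, w = nlms_echo_canceller(far, mic)
```

In this noise-free case, with no near-side speech, the first three coefficients in `w` approach the assumed echo path and the error signal decays toward zero. Adding near-side speech to `mic` would disturb the update, which is the double-talk problem discussed next.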
However, during double-talk, speech at the near-side that acts as uncorrelated noise causes the filter coefficients in an adaptive filter to diverge (or drift). In an effort to alleviate this problem, double-talk detectors, which are commonly known in the art, are often used for disabling the “adaptation” (or converging) during double-talk. Unfortunately, double-talk detectors typically fail to indicate the presence of double-talk for a time period (e.g., a whole syllable) after double-talk begins. During this time period, the filter coefficients may continue to adapt, causing unwanted divergence or drift. Furthermore, double-talk becomes increasingly difficult to detect with such devices as an acoustic echo becomes large in comparison to the near-side signal.
To overcome this double-talk problem, two-path adaptive echo cancellers have been introduced in telecommunications networks. Typical two-path echo cancellers include a nonadaptive filter and an adaptive filter coupled in parallel. (See, for instance, U.S. Pat. No. 5,664,011, entitled “Echo Canceller with Adaptive and Non-adaptive Filter” to Crochiere, et al., which is incorporated by reference.) In the two-path echo canceller structure, both filters operate on the same input signals with the intent to cancel the same echo signal. The error signal of the nonadaptive filter (e.g., the foreground filter) serves as the output for the entire structure, while the error signal of the adaptive filter (e.g., the background filter) is used only for control. The coefficients of the foreground filter are quasi-static and nonadaptive. The coefficients of the background filter adapt continuously, as in the conventional single-path echo canceller described above. The background filter coefficients are used to “update” (e.g., replace) the coefficients of the foreground filter when the performance of the background filter is judged to be better than that of the foreground filter.
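The two-path structure can be sketched as follows. This is an illustration only: the NLMS update, the block-wise comparison of error power, and the `block` and `margin` parameters are hypothetical choices made for the example, and they are not the decision logic of any cited reference.

```python
import random

def fir(w, x):
    """Output of an FIR filter with coefficients w for input history x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def two_path_canceller(far, mic, taps=8, mu=0.5, block=64, margin=0.5, eps=1e-8):
    """Foreground (nonadaptive) and background (adaptive) filters in parallel.

    Returns (output_signal, foreground_coefficients, number_of_copies).
    The copy rule is a hypothetical placeholder: copy when the background
    error power over a block is below `margin` times the foreground's.
    """
    fg = [0.0] * taps   # foreground: in the audio path, changed only by copying
    bg = [0.0] * taps   # background: adapts continuously, used only for control
    x = [0.0] * taps
    out = []
    fg_pow = bg_pow = 0.0
    copies = 0
    for n, (f, d) in enumerate(zip(far, mic), start=1):
        x = [f] + x[:-1]
        e_fg = d - fir(fg, x)   # foreground error is the canceller output
        e_bg = d - fir(bg, x)   # background error drives adaptation only
        out.append(e_fg)
        norm = sum(xi * xi for xi in x) + eps
        bg = [wi + mu * e_bg * xi / norm for wi, xi in zip(bg, x)]
        fg_pow += e_fg * e_fg
        bg_pow += e_bg * e_bg
        if n % block == 0:
            if bg_pow < margin * fg_pow:   # background judged better
                fg = list(bg)              # "update" the foreground filter
                copies += 1
            fg_pow = bg_pow = 0.0
    return out, fg, copies

# Hypothetical echo path and far-side signal, chosen only for illustration.
random.seed(0)
echo_path = [0.5, -0.3, 0.1]
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far))]

out, fg, copies = two_path_canceller(far, mic)
```

Because only `fg` shapes the output, a background filter that drifts affects the output only if its coefficients are copied, which is why the quality of the copy decision matters so much in this structure.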
A primary benefit of the two-path canceller structure lies in its ability to perform very well in the presence of double-talk. A two-path canceller was introduced by Ochiai, et al. in “Echo Canceller with Two Echo Path Models,” IEEE Trans. Commun., Vol. COM-25, No. 6, pp. 589–595, June 1977, which is incorporated herein by reference in its entirety. As disclosed therein, the two-path echo canceller offers a solution to the problem of double-talk detection. More specifically, since the background adaptive filter is not in the audio path of the echo canceller, any degradation of its adaptive coefficients does not directly affect the performance of the foreground filter (unless those coefficients are erroneously copied to the foreground filter), whose output signal is used as the output of the echo canceller. This is in stark contrast to conventional single-path adaptive echo cancellers, wherein any degradation of the adaptive filter's estimate of the echo signal results in an immediate increase in the residual echo level at the echo canceller's output.
Although such two-path echo cancellers solve the double-talk problem in theory, where all conditions are ideal, they have proven problematic to implement in practical situations where more heuristic comparisons are required. More specifically, prior art decision logic algorithms used for permitting or denying the copy of filter coefficients from an adaptive filter to a nonadaptive filter have proven to be extremely difficult to generate. As noted above, conventional two-path echo cancellers can solve some of the double-talk degradation problems, but only if the filter coefficients are properly updated. Unfortunately, conventional echo cancellers typically incorporate imprecise decision logic algorithms, resulting in the erroneous copying of negatively adapted filter coefficients and thus degrading the echo cancellation capabilities of the echo canceller during double-talk as well as in other real-world situations.
At present, the decision logic algorithms found in conventional echo cancellers are ill-suited for the precise decisions required since they rely on multiple user-defined constants, such as decision thresholds and timers. Unfortunately, these constants vary as the application for the echo canceller varies. For example, one set of constants is defined if the echo canceller is used in an application where predominantly acoustic echo is present, whereas a different set of constants is defined in applications where predominantly line echo is present. In addition, these prior art algorithms typically incorporate signal-level comparison tests for double-talk detection. As a result, these algorithms do not apply to those situations where an echo signal induces positive signal gain in the outgoing signal, a situation common in the presence of acoustic echo.
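The limitation of signal-level comparison can be seen in a small example. The test below is a Geigel-style comparison, which is one common form of such a test and is an assumption here, since the text above does not name a specific algorithm. The function name, the threshold of 0.5, and the sample values are likewise hypothetical.

```python
def level_comparison_double_talk(far_window, mic_sample, threshold=0.5):
    """Flag double-talk when the outgoing level exceeds a fixed fraction
    of the recent far-side peak (a Geigel-style test, assumed here).

    The threshold presumes that the echo path attenuates the signal.
    """
    far_peak = max(abs(s) for s in far_window)
    return abs(mic_sample) > threshold * far_peak

# Hypothetical recent far-side samples; no near-side speech in either case.
far_window = [0.2, -0.8, 0.4, 0.1]

line_echo = 0.3 * far_window[1]      # echo path with loss (gain below the threshold)
acoustic_echo = 2.0 * far_window[1]  # echo path with positive gain

line_flag = level_comparison_double_talk(far_window, line_echo)
acoustic_flag = level_comparison_double_talk(far_window, acoustic_echo)
```

With these values the attenuated echo is correctly treated as echo alone, while the amplified echo is flagged as double-talk even though no near-side speech is present. Any fixed threshold of this kind embeds an assumption about echo path loss, which is one reason such constants must be retuned for each application.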
The increasing use of teleconferencing systems and desktop conferencing, and even general speakerphone use, where echo cancellers play a significant role, has led to the requirement of faster and better performing decision logic algorithms. In these and other applications there is a desire to have far better sound quality and sound localization than what has thus far been provided in the prior art, especially in double-talk situations. Accordingly, what is needed in the art is an improved system, and related method, for updating filter coefficients in echo cancellers that does not suffer from the deficiencies found in the prior art.