1. Field of Invention
This invention relates to a method of detecting double-talk and path changes in echo cancellation systems. Echo cancellation is used extensively in telecommunications applications to recondition a wide variety of signals, such as speech, data transmission, and video.
2. Description of Related Art
The search for an effective echo cancellation procedure has produced several different approaches with varying degrees of complexity, cost, and performance. A traditional approach to echo cancellation uses an adaptive filter of length L, where L equals the number of samples necessary to extend to just beyond the duration of the echo. Typically, the adaptive filters contain either 512 or 1024 taps. At the standard telephone sampling rate of 8000 samples per second, this provides the ability to adapt to echo paths as long as 64 ms and 128 ms, respectively.
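The relationship between tap count and echo path duration quoted above is simple arithmetic, and can be checked with a short sketch. The function and constant names here are illustrative only and do not appear in the original text.

```python
SAMPLE_RATE_HZ = 8000  # telephone-band sampling rate quoted in the text


def echo_span_ms(num_taps: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Longest echo path, in milliseconds, that num_taps filter taps can span."""
    return 1000.0 * num_taps / sample_rate_hz


for taps in (512, 1024):
    # Prints "512 taps -> 64.0 ms" and "1024 taps -> 128.0 ms"
    print(taps, "taps ->", echo_span_ms(taps), "ms")
```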
The computational requirements of an adaptive filter are proportional to L for the popular LMS (Least Mean Squares) class of algorithms, and proportional to L² or higher for algorithms such as RLS (Recursive Least Squares). More robust algorithms (like RLS) have greatly improved convergence characteristics over LMS methods, but the L² computational load makes them impractical with current technology. For this reason, the LMS algorithm (and its variants) tends to remain the algorithm of choice for echo cancellation.
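To make the linear cost concrete, the following is a minimal sketch of a single textbook LMS update that also counts its multiplications. It is an illustration under stated assumptions, not the implementation of any particular device: the function name, the step size `mu`, and the newest-first ordering of `x_hist` are all hypothetical choices.

```python
from typing import List, Tuple


def lms_step(w: List[float], x_hist: List[float], s_in: float,
             mu: float) -> Tuple[float, int]:
    """Apply one LMS update to w in place.

    x_hist holds the L most recent far-end samples, newest first.
    Returns (residual sample, number of multiplications used).
    """
    L = len(w)
    echo_est = sum(w[k] * x_hist[k] for k in range(L))  # L multiplies
    err = s_in - echo_est                               # residual echo
    gain = mu * err                                     # 1 multiply
    for k in range(L):
        w[k] += gain * x_hist[k]                        # L multiplies
    return err, 2 * L + 1
```

Under this accounting, a 512-tap filter needs 1025 multiplications per sample, whereas a cost on the order of L² corresponds to 262,144 for the same length.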
Practical echo cancellation devices must provide some means of avoiding divergence caused by double-talk. The double-talk condition arises when signals are transmitted simultaneously from both sides of the echo canceller, that is, when near-end speech is present in addition to the echo. Under such circumstances, the return echo path signal, SIN (see FIG. 1), contains both the return echo of the echo source signal and a double-talk signal. The presence of a double-talk signal will prevent an LMS-based echo canceller from converging on the correct echo path. It will also cause a pre-converged echo canceller to diverge to unpredictable states. Following divergence, the echo canceller will no longer cancel the echo, and must reconverge to the correct solution. Such behaviour is unacceptable in actual devices, so some means of detecting double-talk must be implemented. To prevent divergence, the LMS filter coefficients are typically frozen while double-talk is present.
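The effect of freezing the coefficients can be illustrated with a small simulation. This is a sketch under several assumptions that are not part of the original text: a normalized LMS variant is used for step-size stability, the far-end and near-end signals are white Gaussian noise, the four-tap echo path is hypothetical, and the double-talk flag is an oracle (known exactly) rather than the output of a real detector.

```python
import random


def run(freeze_on_double_talk: bool, seed: int = 1) -> float:
    """Return the squared coefficient error after a double-talk burst."""
    rng = random.Random(seed)
    path = [0.5, -0.3, 0.2, 0.1]      # hypothetical echo path
    L = len(path)
    w = [0.0] * L                     # adaptive filter coefficients
    x_hist = [0.0] * L                # far-end history, newest first
    mu, eps = 0.5, 1e-6               # assumed step size and regularizer
    for n in range(3000):
        x_hist = [rng.gauss(0.0, 1.0)] + x_hist[:-1]
        double_talk = n >= 2000       # oracle flag, assumed known
        near_end = rng.gauss(0.0, 1.0) if double_talk else 0.0
        s_in = sum(h * x for h, x in zip(path, x_hist)) + near_end
        err = s_in - sum(wk * x for wk, x in zip(w, x_hist))
        if double_talk and freeze_on_double_talk:
            continue                  # coefficients frozen: skip the update
        norm = sum(x * x for x in x_hist) + eps
        for k in range(L):
            w[k] += mu * err * x_hist[k] / norm
    return sum((wk - h) ** 2 for wk, h in zip(w, path))


print("frozen during double-talk :", run(True))
print("adapting during double-talk:", run(False))
```

In this setup the frozen filter keeps the coefficients it had converged to, while the filter that keeps adapting is pulled away from the true echo path by the near-end signal.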
Detecting double-talk quickly and reliably is a notoriously difficult problem. Even a small amount of divergence in a fully converged LMS filter will result in a significant increase in the residual echo level. The use of a fast and reliable double-talk detector is crucial to maintain adequate subjective performance.
The simplest, and perhaps most common, method for detecting double-talk is to use signal levels. The echo path typically contains a minimum amount of loss, or reduction, in the return signal. This quantity is often referred to as the Echo Return Loss, or ERL. In most systems, this is assumed to be at least 6 dB. In other words, the return signal SIN will be at a level which is at least 6 dB lower than ROUT provided that there is no double-talk. In the presence of double-talk, the level at SIN often increases so that it is no longer 6 dB lower than ROUT. This condition provides a simple and convenient test for double-talk.
The problem with this approach is that the double-talk detector must have an accurate estimate of the echo path ERL in order to determine if the level at SIN is too high. However, precise knowledge of the ERL is generally not available. If the ERL estimate is too high, the double-talk detector may trigger unnecessarily. Conversely, it may not trigger at all if the ERL estimate is too low.
Another problem with this technique is that it reliably detects only high-level double-talk. Low-level double-talk occurs when the double-talk signal is at a much lower level than the echo source signal. Under this condition, the increase in the level of SIN is usually very small. The double-talk detector may fail to trigger, but noticeable divergence in the LMS filter can still occur.
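The level comparison described above, together with its two weaknesses, can be sketched as follows. The frame length, the function names, and the constant test signals are assumptions made for illustration; only the 6 dB ERL figure comes from the text.

```python
import math
from typing import Sequence


def level_db(frame: Sequence[float]) -> float:
    """Mean power of a frame in dB (a small floor avoids log of zero)."""
    power = sum(v * v for v in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def level_double_talk(r_out: Sequence[float], s_in: Sequence[float],
                      erl_db: float = 6.0) -> bool:
    """Declare double-talk when SIN is less than erl_db below ROUT."""
    return level_db(s_in) > level_db(r_out) - erl_db


r_out = [1.0] * 80           # far-end frame at about 0 dB
echo_only = [0.25] * 80      # echo at about -12 dB (true ERL near 12 dB)
high_dt = [0.75] * 80        # echo plus strong near-end signal
low_dt = [0.30] * 80         # echo plus weak near-end signal

print(level_double_talk(r_out, echo_only))             # False
print(level_double_talk(r_out, high_dt))               # True
print(level_double_talk(r_out, low_dt))                # False: missed
print(level_double_talk(r_out, echo_only, erl_db=18))  # True: false trigger
```

The last two calls illustrate the failure modes discussed above: low-level double-talk leaves SIN below the threshold, and an ERL estimate that is too high causes echo alone to be flagged.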
To detect low-level double-talk, the level of the residual echo signal (SOUT) is often monitored. If no double-talk or background noise is present, and the LMS filter is fully converged, SOUT can be as much as 40 dB lower than ROUT. Assuming that the echo path remains constant, any increase in SOUT will likely be due to double-talk. Of course, if the echo path does change, the resulting increase in SOUT will be mistaken for double-talk. If this method is used, a separate path change detection algorithm must therefore be employed. A unified approach would be simpler and is preferred.
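Residual monitoring of this kind can be sketched as a comparison of the gap between ROUT and SOUT against the gap expected after convergence. The 40 dB figure is taken from the text; the 6 dB margin, the frame length, and the function names are hypothetical.

```python
import math
from typing import Sequence


def level_db(frame: Sequence[float]) -> float:
    """Mean power of a frame in dB (a small floor avoids log of zero)."""
    power = sum(v * v for v in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def residual_alarm(r_out: Sequence[float], s_out: Sequence[float],
                   converged_gap_db: float = 40.0,
                   margin_db: float = 6.0) -> bool:
    """Flag a frame whose residual has risen well above its converged level.

    margin_db is an assumed tolerance, not a value from the text.
    """
    gap_db = level_db(r_out) - level_db(s_out)
    return gap_db < converged_gap_db - margin_db


r_out = [1.0] * 80
print(residual_alarm(r_out, [0.01] * 80))  # False: residual about 40 dB down
print(residual_alarm(r_out, [0.05] * 80))  # True: residual about 26 dB down
```

The alarm only reports that SOUT has risen. It raises the same flag whether the cause is low-level double-talk or a change in the echo path, which is the ambiguity noted above.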
Correlation is a statistical function which is commonly used in signal processing. It can provide a measure of the similarity between two signals (cross-correlation), or a single signal and time-shifted versions of itself (autocorrelation). The use of correlation for double-talk detection per se is known. Several patents exist for correlation-based double-talk detection, including U.S. Pat. Nos. 5,646,990, 5,526,347 and 5,193,112. The correlation-based approaches taken in prior-art methods generally involve the calculation of a single cross-correlation coefficient, usually between RIN and SIN. The problem with this technique is that the degree of correlation can vary widely with different signals and echo paths. This makes it very difficult to set thresholds on the correlation coefficient in order to determine what state the echo canceller is in.
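The difficulty of thresholding a single coefficient can be illustrated with a small sketch. It is not taken from any of the cited patents: the normalized cross-correlation at one fixed lag, the white Gaussian test signals, and both echo paths are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence


def xcorr_coeff(r_in: Sequence[float], s_in: Sequence[float], lag: int) -> float:
    """Normalized cross-correlation between r_in delayed by lag and s_in."""
    n = len(s_in) - lag
    a = r_in[:n]
    b = s_in[lag:]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def delayed(sig: Sequence[float], d: int) -> List[float]:
    """Delay sig by d samples, padding the start with zeros."""
    return [0.0] * d + list(sig[:len(sig) - d])


rng = random.Random(0)
N = 4000
r_in = [rng.gauss(0.0, 1.0) for _ in range(N)]
near_end = [rng.gauss(0.0, 0.5) for _ in range(N)]

# Single-reflection echo path: gain 0.5 at a delay of 3 samples.
echo_simple = [0.5 * v for v in delayed(r_in, 3)]
# Dispersive echo path: gain 0.5 at delays of 3 and 4 samples.
echo_disp = [0.5 * p + 0.5 * q
             for p, q in zip(delayed(r_in, 3), delayed(r_in, 4))]

c_clean = xcorr_coeff(r_in, echo_simple, 3)
c_dt = xcorr_coeff(r_in, [e + v for e, v in zip(echo_simple, near_end)], 3)
c_disp = xcorr_coeff(r_in, echo_disp, 3)
print(c_clean, c_dt, c_disp)
```

For white inputs, the coefficient is essentially 1 for the single-reflection path without double-talk, and drops to roughly 0.7 when near-end noise is added. The dispersive path also yields roughly 0.7 with no double-talk at all, so a fixed threshold on this one number cannot separate the two conditions.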