Echo cancellation has been used extensively in telecommunications applications to recondition a wide variety of signals, such as speech, data transmission, and video. The search for mathematical algorithms to perform echo cancellation has produced many different approaches with varying degrees of complexity, cost, and performance. One common approach, known as sub-band processing, involves separating the speech signal into frequency bands and processing each band separately. This approach has some inherent advantages over most others, most notably reduced computational complexity and increased convergence speed, although practical problems have hampered its use for echo cancellation in the past.
In recent years a new method of separating signals into "sub-bands" has been developed, called wavelet decomposition and reconstruction. Wavelet decomposition is a process of band splitting and "down-sampling" (i.e. reducing, or decimating, the sample rate) of a signal into "wavelet packets". Wavelet reconstruction is the process of "up-sampling" (i.e. increasing the sample rate of a signal, usually by zero insertion followed by low-pass, or anti-imaging, filtering) and re-combining the "wavelet packets" to regenerate the original signal. Wavelet decomposition and reconstruction allow the original signal to be regenerated without distortion or degradation. This method has become popular for many different signal processing applications. However, the application of wavelet decomposition and reconstruction to the problem of echo cancellation has proven unsuccessful to date, for reasons discussed in greater detail below.
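The decomposition and reconstruction steps described above can be illustrated with a minimal sketch. It assumes the Haar wavelet, chosen here only because it is the simplest case; the text does not name a particular wavelet, and the function names and sample values are hypothetical.

```python
import math

def haar_decompose(x):
    """Split an even-length signal into a low band and a high band.

    Each band is down-sampled by 2, so it holds half as many samples as x.
    """
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def haar_reconstruct(low, high):
    """Up-sample and re-combine the two bands into the original signal."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(low, high):
        x.append(s * (a + d))
        x.append(s * (a - d))
    return x

# Hypothetical test signal (any even-length list of floats works).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
low, high = haar_decompose(signal)
rebuilt = haar_reconstruct(low, high)

# Largest difference between the original and the reconstruction.
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(len(low), len(high), max_error)
```

Each band carries half the samples of the input, and the reconstruction differs from the original only by floating-point rounding.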
In some applications, such as the cancellation of acoustic speech echoes, the echo duration can be extremely long, on the order of 100 msec to 500 msec. A traditional approach to echo cancellation uses an adaptive transversal filter of length M, where M equals the number of samples necessary to extend just beyond the duration of the echo. The computational requirement to implement such a filter is proportional to 2M operations per sample for the popular LMS (Least Mean Squares) class of algorithm, and proportional to M² or higher for algorithms such as RLS (Recursive Least Squares). The more robust algorithms (RLS being one example) have improved convergence characteristics over prior art algorithms, but their computational load increases dramatically with M. Furthermore, the convergence time increases in proportion to M for most algorithms. Fast convergence is an important criterion for echo cancellation, especially for acoustic speech echo cancellation, since the echo path may be continually changing as people and objects move within the environment. Prior art echo cancellers employing adaptive transversal filters capable of eliminating an echo signal having a duration of 500 msec or more have been found to suffer from excessive computational complexity as well as slow convergence speed.
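As a rough illustration of where the 2M figure comes from, the following sketch implements a plain LMS adaptive transversal filter. The echo path, step size, and signal are hypothetical values chosen for the demonstration, and the echo is noise-free, which is far more benign than a real acoustic environment.

```python
import random

def lms_echo_canceller(far_end, mic, num_taps, step):
    """Adaptive transversal (FIR) filter updated with the basic LMS rule.

    Returns the final tap weights and the residual echo at every sample.
    """
    weights = [0.0] * num_taps
    delay_line = [0.0] * num_taps
    residual = []
    for x, d in zip(far_end, mic):
        delay_line = [x] + delay_line[:-1]
        # Filter output: M multiplications per sample.
        y = sum(w * u for w, u in zip(weights, delay_line))
        e = d - y
        # Weight update: a further M multiplications per sample.
        weights = [w + step * e * u for w, u in zip(weights, delay_line)]
        residual.append(e)
    return weights, residual

random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]  # hypothetical 4-tap echo channel
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
# Microphone signal: the far-end signal convolved with the echo path.
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]

weights, residual = lms_echo_canceller(far_end, mic, num_taps=4, step=0.05)
print([round(w, 3) for w in weights])
```

With M = 4 the cost is trivial; for a 500 msec echo, M grows to thousands of taps and both loops inside the per-sample update grow with it.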
Sub-band processing is thought to be an attractive alternative to the use of a single adaptive transversal filter for acoustic speech echo cancellation because it reduces computational complexity. When the signal is divided into N sub-bands, a bank of N adaptive filters must be used instead of a single filter. However, the sub-band signals can be down-sampled by a factor of N, so the filter outputs need only be calculated 1/N as often. Additionally, the length of each filter may be reduced from M to M/N. Taken together, N filters of length M/N, each updated at 1/N of the original rate, reduce the computational complexity to something on the order of 2M/N for LMS-type adaptive filters, while also improving convergence behaviour due to the use of shorter LMS filters. When M is large, the reduction in computational load is significant, and the overhead of implementing the filter banks becomes insignificant by comparison.
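The arithmetic behind the 2M/N estimate can be checked with a short calculation. The sketch below assumes an 8 kHz sample rate and N = 8 sub-bands, neither of which is stated in the text, and it ignores the cost of the filter banks themselves.

```python
def fullband_lms_ops(m):
    """Approximate LMS operations per input sample for one filter of length M."""
    return 2 * m

def subband_lms_ops(m, n):
    """Approximate LMS operations per input sample for N sub-band filters.

    Each filter has length M/N and costs 2*(M/N) operations per update,
    but is updated only once every N input samples.
    """
    per_update = 2 * (m / n)
    return n * per_update / n

sample_rate_hz = 8000   # assumed for illustration
echo_seconds = 0.5      # the 500 msec echo mentioned in the text
m = int(sample_rate_hz * echo_seconds)  # filter length M = 4000 taps
n = 8                   # assumed number of sub-bands

full = fullband_lms_ops(m)
sub = subband_lms_ops(m, n)
print(m, full, sub, full / sub)
```

Under these assumptions the sub-band arrangement needs roughly 1/N of the full-band operation count, before filter bank overhead is added back.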
The problem with the filter bank approach to sub-band processing is that the transition between bands makes it impossible to perfectly isolate each band from its adjacent bands without the use of "ideal" band-pass filters (i.e. filters with infinitely sharp cutoff). There is a known trade-off between the amount of echo cancellation possible, the filter roll-off, the filter group delay distortion, and the ability to reconstruct the original input signal from the sub-bands without distortion. A type of filter known as a QMF (Quadrature Mirror Filter) provides one method of filter bank design that has been used in the past to help overcome these problems. The QMF is designed to band-split a signal and then recombine the bands without distortion of the signal. However, the use of QMFs for echo cancellation suffers from problems relating to distortion caused by aliasing, as discussed in greater detail below.
Wavelet decomposition and reconstruction allow a sampled data signal to be split into separate wavelet "packets" for echo cancellation, and thereafter allow the original signal to be reconstructed without any added distortion or signal degradation. In fact, the original signal can be perfectly reconstructed.
One problem with echo cancellation using wavelet decomposition is that, as indicated above, the down-sampling process creates distortion in the wavelet packets due to aliasing. This effect causes the echo channel to appear time-varying, which violates the assumption underlying known methods of adaptive filtering for speech echo cancellation, namely that the echo channel is both linear and time-invariant. Any processing done on the wavelet packets while the echo channel is time-varying invalidates the echo cancellation process, and signal distortion results. This limits the overall amount of echo cancellation achievable using the method of wavelet decomposition and reconstruction.
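The loss of time invariance can be seen in the down-sampling step alone. The sketch below is only an illustration of that one step, using a hypothetical ramp signal and a factor of 2; it is not a model of a complete wavelet echo canceller.

```python
def downsample(x, factor=2):
    """Keep every factor-th sample, starting with the first."""
    return x[::factor]

def delay(x, k):
    """Delay x by k samples, padding with zeros and keeping the same length."""
    return [0.0] * k + x[:len(x) - k]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

reference = downsample(x)                 # no input delay
shifted_by_1 = downsample(delay(x, 1))    # input delayed by one sample
shifted_by_2 = downsample(delay(x, 2))    # input delayed by two samples

# A time-invariant system would map a delayed input to a delayed output.
one_is_delayed_copy = any(
    shifted_by_1 == delay(reference, k) for k in range(len(reference))
)
two_is_delayed_copy = shifted_by_2 == delay(reference, 1)
print(reference, shifted_by_1, one_is_delayed_copy, two_is_delayed_copy)
```

A one-sample input delay yields an output built from entirely different samples, not a delayed copy of the reference, while a two-sample delay does yield a delayed copy. The decimator therefore behaves as a periodically time-varying system, which is the property that conflicts with the time-invariance assumption described above.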