The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Echo cancellers (ECANs) used in telecommunication networks generally consist of two sound level reduction components: a convolution processor (CP) and a non-linear processor (NLP). The CP and the NLP operate in different ways depending upon whether multiple parties are speaking simultaneously in a call, or if there is just one party speaking at a time.
When only one party in a point-to-point connection is talking at a time, a CP controller compares the signal in the two directions, forms an estimate of the echo signal (via estimation of the echo path impulse response) and then injects the negative of its echo estimate into the return path to eliminate the echo. As long as the echo signal is a linear and time-invariant function of the original signal, and is within the time range of the ECAN, the CP can effectively cancel the echo.
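The estimate-and-subtract behavior of the CP described above can be illustrated with a short sketch. It uses a normalized least-mean-squares (NLMS) adaptive filter, which is one common way to estimate an echo path impulse response. Nothing in this section says that any particular ECAN uses NLMS. The function names, tap count, step size, and the simulated echo path are all hypothetical values chosen for illustration.

```python
import random

def nlms_echo_cancel(far_end, returned, taps=8, mu=0.5, eps=1e-8):
    """Illustrative NLMS canceller (an assumed technique, not a specific ECAN's).

    far_end  : samples sent toward the echo source
    returned : samples coming back on the return path (here, echo only)
    Returns the residual signal and the final impulse response estimate.
    """
    w = [0.0] * taps          # running estimate of the echo path impulse response
    history = [0.0] * taps    # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, returned):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - echo_estimate  # same as injecting the negative of the estimate
        residual.append(e)
        power = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / power for wi, hi in zip(w, history)]
    return residual, w

def simulate_echo(far_end, path):
    """Linear, time-invariant echo: convolve the far-end signal with `path`."""
    return [
        sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far_end))
    ]

rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_path = [0.0, 0.5, -0.3, 0.1]   # hypothetical echo path, shorter than `taps`
returned = simulate_echo(far_end, true_path)
residual, w = nlms_echo_cancel(far_end, returned)
```

In this sketch the simulated echo path is linear, time-invariant, and shorter than the filter, so the estimate converges and the residual becomes negligible. A path longer than `taps` samples, or one that is non-linear or time-varying, would leave residual echo.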
Since the coder-decoder (codec) circuits used in telecommunication networks may be non-linear, introduce distortion, and/or may not be time-invariant, the cancellation is imperfect. Therefore, an NLP is coupled in the circuit after the CP to eliminate any residual echo. The NLP acts on the output of the CP by attenuating the residual echo so as to make it inaudible.
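As a rough illustration of the attenuation step described above, the following sketch scales down any residual sample whose magnitude is small relative to the far-end signal level and passes larger samples unchanged. This is a simplified, hypothetical suppressor, not a description of any specific NLP. The function name `nlp_suppress` and the values of `threshold_ratio` and `attenuation` are assumptions.

```python
def nlp_suppress(residual, far_end_level, threshold_ratio=0.05, attenuation=0.01):
    """Attenuate low-level residual echo (hypothetical, simplified NLP).

    Samples below `threshold_ratio * far_end_level` in magnitude are treated
    as residual echo and multiplied by `attenuation`; larger samples pass
    through unchanged.
    """
    limit = threshold_ratio * far_end_level
    return [s * attenuation if abs(s) < limit else s for s in residual]

# With a far-end level of 1.0 the limit is 0.05, so the first and third
# samples are attenuated and the second passes through unchanged.
cleaned = nlp_suppress([0.01, 0.5, -0.02], far_end_level=1.0)
```

Because a suppressor of this kind attenuates whatever falls below its threshold, it would also attenuate quiet near-end speech.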
If both parties in a point-to-point telephone connection speak at the same time, a condition called double-talk, the ECAN operates in a different mode than when only one party talks at a time. During double-talk, the NLP is eliminated from the transmission path, because otherwise the NLP could seriously degrade the near-end speech due to the added attenuation introduced by the NLP. Further, the CP controller typically stops updating its estimate of the echo path impulse response. The decision to eliminate the NLP and to stop updating the impulse response estimate for the CP is made by a Double-talk Detector (DTD) circuit or algorithm. The DTD is a signal processing control function that is typically a part of the ECAN.
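One well-known way to make the double-talk decision is the Geigel detector, sketched below. This section does not say that any given DTD uses it, so treat it as an assumed example. The detector declares double-talk when the near-end sample is large compared with the recent peak of the far-end signal, and holds that decision for a short hangover period. The function name and the values of `window`, `threshold`, and `hangover` are hypothetical.

```python
from collections import deque

def geigel_double_talk(far_end, near_end, window=64, threshold=0.5, hangover=32):
    """Return one boolean per sample: True while double-talk is declared.

    Illustrative Geigel-style detector (an assumed technique). Note that
    near-end speech with a silent far end is also flagged, which still
    correctly freezes adaptation.
    """
    recent = deque([0.0] * window, maxlen=window)  # recent far-end magnitudes
    hold = 0                                       # remaining hangover samples
    flags = []
    for x, d in zip(far_end, near_end):
        recent.append(abs(x))
        if abs(d) > threshold * max(recent):
            hold = hangover        # near-end too loud to be echo alone
        flags.append(hold > 0)
        if hold > 0:
            hold -= 1
    return flags

# Far end at constant level 1.0; the return path carries a 0.2 echo, plus a
# burst of near-end speech at sample 5.
far = [1.0] * 10
near = [0.2] * 10
near[5] = 0.9
flags = geigel_double_talk(far, near, window=4, threshold=0.5, hangover=2)
```

While a flag is true, a controller following the behavior described above would stop updating the CP's impulse response estimate and bypass the NLP.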
During double-talk, as long as there are only two parties in a telephone connection, the reduced effectiveness of the echo canceller, which results from the CP having stopped updating its impulse response estimate and from the NLP having been eliminated, generally goes unnoticed. When two people are talking simultaneously, each person is typically less attentive to echo than when only one person is talking, in part because the speech of the other person masks the echo of each talker.
However, the foregoing approach does not fully cancel echo when the echo path impulse response extends beyond the coverage length of the existing CP, or when the echo path is not linear and time-invariant, as is the case with many low-bit-rate codecs.
Known manufacturers of voice conferencing systems include Biamp (AudiaFlex and Nexia), ClearOne (XAP), Polycom (Vortex), Avaya Meeting Exchange (formerly made by Spectel), Radvision (Scopia), and Compunetix (Contex). None is known to address network echo control as described herein.