1. Field of the Invention
This invention relates generally to echo cancellers and, more particularly, to a novel circuit and method for detecting near-end talking activity and double talk situations in order to control effectively the convergence function of an acoustic echo canceller (AEC).
2. Description of Background and Related Art
The presence of echoes in long-distance telephony is a thorny problem. A long-distance communication circuit usually comprises four-wire and two-wire segments, which are joined at each end by hybrid circuits. Impedance mismatch in a hybrid circuit causes a portion of the signal received at the hybrid circuit to be reflected back onto the transmit four-wire segment whence it came, and this reflected signal is perceived as echo by the speaker who originated it. Adaptive echo cancellers are thus employed to minimize the echo signal created on four-wire transmission lines.
Normally, a four-wire receive signal is at a higher level than its echo signal on the four-wire transmit path because there is loss across the hybrid circuit. Near-end speech on the transmit path will therefore typically be stronger than the echo signal. However, near-end speech is unwanted noise as far as convergence of the echo canceller is concerned, because the canceller would diverge if it continued to update its estimated impulse response while near-end speech is present. Various techniques and schemes have therefore been developed to provide double talk detection in echo cancellers.
The problem of echo cancellation on the telephone network is exacerbated by the connection of a handsfree terminal at one or both ends of the transmission line. Double talk detection (DTD) in a handsfree system refers to the determination of whether, in the microphone output, there is near-end speech mixed with far-end speech, probably much stronger, played through the loudspeaker. By comparison with double talk detection for network echo cancellation applications, the DTD in an acoustic echo cancellation handsfree (ECHF) system is more likely to be subjected to sudden echo path changes as well as to echo levels that are well above the level of near-end speech. This is because, in a handsfree terminal, the receive transducer or speaker is closer to the microphone of the terminal than the near-end user is; furthermore, the case of the terminal conducts a substantial amount of acoustic energy from speaker to microphone. In a typical implementation, it is not unusual for the far-end signal from the loudspeaker to be as much as 25 dB (decibels) above the level of the near-end signal. The near-end activity of a user is therefore difficult to ascertain because the far-end signal from the loudspeaker will mask at least a portion of the signal from the near-end user.
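For a sense of scale, the 25 dB figure can be converted to linear ratios using the standard decibel definitions. The symbols $A_{\text{far}}$, $A_{\text{near}}$, $P_{\text{far}}$ and $P_{\text{near}}$ are introduced here purely for illustration and denote the amplitudes and powers of the far-end echo and the near-end speech at the microphone.

```latex
\[
20\log_{10}\frac{A_{\text{far}}}{A_{\text{near}}} = 25\ \text{dB}
\;\Longrightarrow\;
\frac{A_{\text{far}}}{A_{\text{near}}} = 10^{25/20} \approx 17.8,
\qquad
\frac{P_{\text{far}}}{P_{\text{near}}} = 10^{25/10} \approx 316.
\]
```

In other words, at the upper end of the range mentioned above, the echo at the microphone would be roughly eighteen times the amplitude of the near-end speech.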
Numerous schemes of double talk detection have been devised, and they usually fall into one of three categories. A first category, which may be labelled the energy comparison scheme, usually employs power detectors for detecting the average power, the peak power and the residual power of various signals to generate the output signal of the double talk detector. Examples of this type of double talk detector are described in U.S. Pat. Nos. 4,360,712; 5,463,618 and 4,645,883.
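By way of illustration only, the energy comparison idea can be sketched as a simple peak comparison in the spirit of the well-known Geigel detector. This is not the circuit of any of the patents cited above; the function name `peak_comparison_dtd` and the parameters `threshold` and `window` are assumptions chosen for the example.

```python
from collections import deque


def peak_comparison_dtd(far_end, mic, threshold=0.5, window=64):
    """Return one boolean per sample: True where double talk is declared.

    Double talk is declared when the microphone magnitude exceeds
    `threshold` times the largest far-end magnitude seen over the last
    `window` samples. Both parameters are illustrative assumptions.
    """
    recent = deque(maxlen=window)  # sliding window of far-end magnitudes
    flags = []
    for x, d in zip(far_end, mic):
        recent.append(abs(x))
        flags.append(abs(d) > threshold * max(recent))
    return flags


far = [1.0] * 8
echo_only = [0.25 * x for x in far]           # echo attenuated by the echo path
with_near_end = [0.25 * x + 0.5 for x in far]  # echo plus near-end speech
print(peak_comparison_dtd(far, echo_only))
print(peak_comparison_dtd(far, with_near_end))
```

A fixed threshold of this kind presumes that the echo is attenuated relative to the far-end signal. If the echo path has gain instead, as can happen in a handsfree terminal, the same test fires on echo alone.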
A second category, which may be labelled the cross-correlation technique, is basically an extension of the energy comparison category; it adds a cross-correlation criterion between various signals to arrive at a control decision. This scheme is more complicated than the energy comparison technique and requires additional memory and computational power. Examples of this type of double talk detection may be found in U.S. Pat. Nos. 5,646,990 and 5,193,112.
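A cross-correlation criterion of this general kind might be sketched as follows. The sketch assumes, for illustration, that the criterion is the largest normalized cross-correlation between the far-end signal and the microphone signal over a small range of delays; the cited patents may use different signals and statistics. The name `normalized_xcorr` and the parameter `max_lag` are hypothetical.

```python
import math


def normalized_xcorr(x, d, max_lag=8):
    """Largest |normalized cross-correlation| between x and d.

    For each delay in 0..max_lag, x delayed by that many samples is
    compared with d. A value near 1 suggests d is mostly echo of x;
    a lower value suggests another source, such as near-end speech.
    """
    best = 0.0
    for lag in range(max_lag + 1):
        xs = x[:len(x) - lag]  # far-end samples that have a delayed partner
        ds = d[lag:]           # microphone samples aligned with xs
        num = sum(a * b for a, b in zip(xs, ds))
        den = math.sqrt(sum(a * a for a in xs) * sum(b * b for b in ds))
        if den > 0.0:
            best = max(best, abs(num) / den)
    return best
```

The inner sums over each delay show where the additional memory and computation come from: every decision needs a buffer of past samples and one product per sample per delay.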
Yet a third category is related to the cross-correlation technique. It monitors the directions of the updating vectors for the echo canceller, which are given by an adaptation algorithm such as NLMS (Normalized Least Mean Square). If the updating vectors over a number of samples all point in roughly a common direction, the echo canceller is in the converging mode. If, on the other hand, the vectors point in diverse directions, the echo canceller is deemed to have converged. This decision process, together with the energy of the signals, is then used to determine whether a double talk condition exists. This scheme may provide a reliable result but is computationally very intensive. DTD implementations based on monitoring updating vectors may be found in U.S. Pat. No. 4,918,727 as well as in the paper "A New Double-Talk Detection Algorithm Based On The Orthogonality Theorem" by Hua Ye and Bo-Xiu Wu, IEEE Transactions on Communications, Vol. 39, No. 11, November 1991.
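The idea of monitoring update directions can be sketched with a plain NLMS loop. The "coherence" measure below (length of the summed update vectors divided by the sum of their lengths) is an assumption made for this example and is not taken from the cited patent or paper; the function name and parameters are likewise hypothetical.

```python
import math


def nlms_update_coherence(x, d, w0, mu=0.5, eps=1e-8):
    """Run NLMS from weights w0 and return (final weights, coherence).

    coherence = |sum of update vectors| / sum of |update vectors|.
    It is large when the updates share a common direction (converging)
    and close to zero when they scatter in diverse directions.
    """
    taps = len(w0)
    w = list(w0)
    sum_vec = [0.0] * taps  # running vector sum of the updates
    sum_norm = 0.0          # running sum of the update lengths
    for n in range(taps - 1, len(x)):
        u = [x[n - k] for k in range(taps)]  # most recent sample first
        e = d[n] - sum(wk * uk for wk, uk in zip(w, u))
        g = mu * e / (eps + sum(uk * uk for uk in u))
        dw = [g * uk for uk in u]            # NLMS update vector
        w = [wk + dk for wk, dk in zip(w, dw)]
        sum_vec = [s + dk for s, dk in zip(sum_vec, dw)]
        sum_norm += math.sqrt(sum(dk * dk for dk in dw))
    if sum_norm == 0.0:
        return w, 0.0
    return w, math.sqrt(sum(s * s for s in sum_vec)) / sum_norm
```

Even in this reduced form, each sample requires forming, storing and accumulating a full update vector in addition to the filtering itself, which gives some indication of why schemes in this category are costly.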
Most of the known techniques and schemes of the prior art were developed for use in network echo cancellation and probably perform adequately in that environment; however, their performance in ECHF applications is not entirely satisfactory. The main reason, as mentioned above, is that in a handsfree environment the portion of the far-end signal from the loudspeaker appearing as echo at the microphone of the terminal is usually much stronger than the near-end signal, and the difference in a typical implementation can be as large as about 25 dB. The far-end signal from the loudspeaker tends to mask the signal from the near-end user and makes the determination of double talk conditions very difficult using the known techniques.