I. Field of the Invention
The present invention relates to speech processing. More particularly, the present invention relates to an apparatus and method for echo cancellation that is especially suitable for acoustic echo cancellation.
II. Description of the Related Art
Transmission of voice by digital techniques has become widespread, particularly in cellular telephone and personal communication systems (PCS) applications. This, in turn, has created an interest in improving speech processing techniques. One area in which improvements have been developed is that of echo cancellation.
There are two types of echo cancellers: the network echo canceller and the acoustic echo canceller. A network echo canceller cancels the echo produced in the telephone network. A land-based telephone is connected to a central office by a two-wire line to support transmission in both directions. For calls farther than about 35 miles, the two directions of transmission must be segregated onto physically separate wires, resulting in a four-wire line. The device that interfaces the two-wire and four-wire segments is known as a hybrid. An impedance mismatch at the hybrid results in an echo, which must be removed by a network echo canceller. Acoustic echo cancellers are often used in teleconferencing and hands-free telephony applications. For example, an acoustic echo canceller may eliminate acoustic echo resulting from the feedback between a loudspeaker and a microphone.
In FIG. 1, a block diagram of a traditional echo canceller 100 is shown. The echo canceller 100 may be either a network echo canceller or an acoustic echo canceller. Speech signals from the two callers are labeled as far-end speech signal x(n) and near-end speech signal v(n). In a network echo canceller, the reflection of x(n) off the hybrid (not shown) is modeled as passing x(n) through an unknown echo channel 102 to produce the echo signal y(n). In an acoustic echo canceller, having speech signal x(n) broadcast from a loudspeaker and picked up by a microphone is modeled as passing x(n) through the unknown echo channel 102, producing echo signal y(n). Echo signal y(n) is summed at a summer 104 with near-end speech signal v(n). It should be noted that the unknown echo channel 102 and the summer 104 are not included elements in the echo canceller but are artifacts of the system and are illustrated for reference purposes only.
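The signal model of FIG. 1 can be illustrated with a minimal sketch. It assumes the unknown echo channel 102 behaves as a finite impulse response, which the text does not state; the names `echo_channel`, `microphone_signal`, and `h` are hypothetical and chosen only for illustration.

```python
def echo_channel(x, h):
    """Model the unknown echo channel as a convolution of x(n) with h.

    h is an ASSUMED finite impulse response; the real channel is unknown.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


def microphone_signal(x, v, h):
    """Sum the echo y(n) with the near-end speech v(n), as at summer 104."""
    y = echo_channel(x, h)
    return [yn + vn for yn, vn in zip(y, v)]
```

Feeding a single far-end impulse through the sketch returns the assumed impulse response itself, added to whatever near-end signal is present.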
To remove low-frequency background noise, the sum of the echo signal y(n) and the near-end speech signal v(n) is passed through a high-pass filter (HPF) 106 to produce a signal r(n). The signal r(n) is provided as one input to a summer 108 and to a near-end speech detection unit 110.
The other input of the summer 108 (a subtract input) is coupled to the output of an adaptive filter 112. The adaptive filter 112 receives the far-end speech signal x(n) and a feedback of the echo residual signal e(n) output from the summer 108. In canceling the echo, the adaptive filter 112 continually tracks the impulse response of the echo path and produces an echo replica ŷ(n), which the summer 108 subtracts from the signal r(n) output by the HPF 106. The adaptive filter 112 also receives a control signal from the near-end speech detection unit 110 so as to freeze the filter adaptation process when near-end speech is detected.
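The loop formed by the adaptive filter 112 and the summer 108 can be sketched as follows. The text does not name an adaptation algorithm, so this sketch substitutes the normalized least-mean-squares (NLMS) update, a common choice for echo cancellers. The class name, `step_size`, `eps`, and the `adapt` flag (standing in for the control signal from the near-end speech detection unit 110) are all assumptions.

```python
class AdaptiveEchoFilter:
    """NLMS sketch of adaptive filter 112 (algorithm choice is an assumption)."""

    def __init__(self, num_taps, step_size=0.5, eps=1e-8):
        self.w = [0.0] * num_taps      # filter tap coefficients
        self.buf = [0.0] * num_taps    # recent far-end samples, newest first
        self.mu = step_size
        self.eps = eps                 # guards against division by zero

    def process(self, x_n, r_n, adapt=True):
        """Consume one far-end sample x(n) and one filtered sample r(n).

        Returns the echo residual e(n) = r(n) - y_hat(n).
        """
        self.buf = [x_n] + self.buf[:-1]
        y_hat = sum(w * x for w, x in zip(self.w, self.buf))  # echo replica
        e = r_n - y_hat                                       # summer 108
        if adapt:  # frozen when near-end speech is detected
            energy = sum(x * x for x in self.buf) + self.eps
            gain = self.mu * e / energy
            self.w = [w + gain * x for w, x in zip(self.w, self.buf)]
        return e
```

Calling `process` with `adapt=False` still subtracts the echo replica but leaves the coefficients untouched, which mirrors the freeze behavior described above.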
The echo residual signal e(n) is also output to the near-end speech detection unit 110 and a center-clipper 114. The output of the center-clipper 114 is provided as the echo cancellation signal.
Although the adaptive digital filtering performed by the traditional echo canceller is satisfactory, the adaptive filter 112 normally cannot precisely replicate the channel, thus resulting in some residual echo. Furthermore, the residual echo processing by the center-clipper 114 causes a problem in digital cellular and PCS systems. The center-clipper 114 eliminates the residual echo by passing the signal through a nonlinear function that sets to zero any signal portion that falls below a threshold A and passes unchanged any signal segment that lies above the threshold A. Since digital systems may be sensitive to nonlinear effects, center-clipping causes degradation in voice quality.
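The nonlinear function applied by the center-clipper 114 can be sketched in a few lines. The sketch assumes the threshold A is compared against the sample magnitude, which is the usual reading but is not spelled out in the text; the function name `center_clip` is hypothetical.

```python
def center_clip(e, threshold):
    """Zero every sample whose magnitude is below the threshold A.

    Samples at or above the threshold pass through unchanged.
    Comparing against the magnitude is an ASSUMPTION about the text.
    """
    return [0.0 if abs(s) < threshold else s for s in e]
```

For example, with a threshold of 0.1 the input `[0.01, -0.02, 0.5, -0.7]` becomes `[0.0, 0.0, 0.5, -0.7]`. The abrupt jump between zeroed and unchanged samples is the nonlinearity the paragraph above identifies as a source of voice-quality degradation.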
An exemplary echo canceller which provides high dynamic echo cancellation for improved voice quality, and which addresses the nonlinearity problem, is disclosed in U.S. Pat. No. 5,307,405, entitled “NETWORK ECHO CANCELLER,” which is assigned to the assignee of the present invention and incorporated by reference herein, and also in U.S. Pat. No. 5,646,991, entitled “NOISE REPLACEMENT SYSTEM AND METHOD IN AN ECHO CANCELLER,” also assigned to the assignee of the present invention and incorporated by reference herein.
The echo canceller of U.S. Pat. Nos. 5,307,405 and 5,646,991 makes use of at least two adaptive filters for obtaining a better estimate of the echo. One filter performs the echo cancellation, while another filter performs state determination by keeping track of the presence of near-end and far-end speech. A noise analysis/synthesis feature eliminates the non-linear effects of the center-clipper by replacing the echo residual signal with a synthesized noise signal when appropriate.
The echo canceller of U.S. Pat. Nos. 5,307,405 and 5,646,991 may be used for both network and acoustic echo cancellation, although it is more suitable for use as a network echo canceller. Network echo cancellers cancel echoes due to hybrids. Because the echo caused by hybrids has a long delay, the adaptive filters are generally required to have a large number of filter tap coefficients to accommodate the long delay. For example, an adaptive filter having 256 filter tap coefficients may be suitable. The large number of filter tap coefficients provides for accuracy in estimating and canceling the echo, but also imposes high processing power requirements. The use of multiple adaptive filters further increases processing power requirements. The high processing power is generally available in a central station, where a network echo canceller may be implemented. Thus, an echo canceller having high processing power requirements may be suitable for network echo cancellation applications.
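The processing cost discussed above can be made concrete with a rough count of multiply-accumulate (MAC) operations. The sketch assumes an LMS-type filter, which needs one pass over the taps to form the output and one to update the coefficients, and it leaves the sample rate as a parameter because the text does not give one. The function name and the 8 kHz figure used in the example are assumptions.

```python
def lms_macs_per_second(num_taps, sample_rate_hz, num_filters=1):
    """Rough MAC count for LMS-type adaptive filters (an assumed cost model).

    Each sample costs num_taps MACs for the FIR output plus num_taps MACs
    for the coefficient update, per filter.
    """
    per_sample = 2 * num_taps
    return per_sample * sample_rate_hz * num_filters
```

Under these assumptions, one 256-tap filter at an assumed 8000 samples per second costs 4,096,000 MACs per second, and a two-filter design doubles that to 8,192,000.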
However, for applications having limited processing power, an echo canceller characterized by multiple adaptive filters with a large number of filter taps will not be suitable. One application in which processing power is generally limited is that of a mobile telephone. In a mobile telephone, acoustic echo cancellation may be necessary to cancel echo resulting from the feedback between the loudspeaker and the microphone. This echo, also known as ear seal echo, is far-end voice that leaks through the acoustic channel and is picked up by the microphone on the near end (mobile side). It must be cancelled to prevent it from being delivered back to the far-end speaker. The echo canceller must be able to cancel acoustic echo with a high degree of precision, and it must do so using limited resources. These problems and deficiencies are recognized and solved by the present invention in the manner described below.
The present invention is an improved apparatus and method for echo cancellation. The echo canceller of the present invention may be implemented in systems having limited processing resources. The echo canceller comprises an adaptive filter that tracks the impulse response of the echo path and produces an estimate of the echo. Filter adaptation is controlled by a controller based on the rate of the far-end speech signal, the rate of the near-end signal, an acoustic loss measure, and a double talk hangover indicator. A rate estimator determines the rate of the far-end speech signal and the rate of the near-end signal. The rate at which a frame of data is encoded in a variable rate communications system may be indicative of the presence or absence of speech. An acoustic loss unit measures the acoustic loss, defined to be the energy of the far-end speech signal divided by the energy of the near-end signal. A double talk hangover unit determines the double talk hangover indicator. The double talk hangover indicator is set to prevent filter adaptation when both the near-end and the far-end are active or when the near-end is active but the far-end is inactive. To more accurately determine the status of the near-end and the status of the far-end, the double talk hangover indicator may also be based on the acoustic loss measure and the status of a timer.
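The control inputs described above can be sketched as follows. Only the definition of acoustic loss (far-end energy divided by near-end energy) and the rule that the hangover indicator blocks adaptation come from the text. Everything else is an assumption made for illustration: the integer rate scale (0 for the lowest rate up to 3 for full rate), the thresholds `active_rate` and `min_echo_loss`, the frame-count timer, and all function and class names.

```python
def frame_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def acoustic_loss(far_frame, near_frame, floor=1e-12):
    """Far-end energy divided by near-end energy, as defined in the text.

    `floor` is an assumed guard against division by zero on silent frames.
    """
    return (frame_energy(far_frame) + floor) / (frame_energy(near_frame) + floor)


def near_end_active(near_rate, loss, active_rate=2, min_echo_loss=4.0):
    """ASSUMED rule: the near-end frame was coded at a speech-like rate and
    is too loud relative to the far end (low loss) to be echo alone."""
    return near_rate >= active_rate and loss < min_echo_loss


class DoubleTalkHangover:
    """Holds the indicator for a few frames after near-end activity ends."""

    def __init__(self, hangover_frames=5):
        self.hangover_frames = hangover_frames  # assumed timer length
        self.remaining = 0

    def update(self, near_active):
        """Return True while filter adaptation should be prevented."""
        if near_active:
            self.remaining = self.hangover_frames
            return True
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


def adaptation_enabled(far_rate, hangover_set, active_rate=2):
    """Adapt only when the far end is active and the indicator is clear."""
    return far_rate >= active_rate and not hangover_set
```

Note that the two blocking cases named in the text (both ends active, or near end active with far end inactive) reduce to "near end active", which is why `update` takes a single flag.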
The controller may also comprise a step size adaptation unit for determining the adaptation step size of the adaptive filter. The step size may be increased for faster adaptation when it is determined that the adaptive filter has not yet converged.
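A step size adaptation unit of this kind might be sketched as below. The text says only that the step size may be increased while the filter has not converged; the convergence test used here (residual energy at least `min_reduction` times below the energy before cancellation) and the `slow` and `fast` values are assumptions, as are the function names.

```python
def looks_converged(r_frame, e_frame, min_reduction=10.0):
    """ASSUMED convergence test based on how much the echo was reduced.

    Compares the energy of r(n) (before cancellation) with the energy of
    e(n) (after cancellation). Returns False when r(n) carries no energy,
    since there is nothing to judge by.
    """
    r_energy = sum(s * s for s in r_frame)
    e_energy = sum(s * s for s in e_frame)
    if r_energy == 0.0:
        return False
    return r_energy >= min_reduction * e_energy


def choose_step_size(converged, slow=0.1, fast=0.5):
    """Use the larger step size until the filter appears to have converged."""
    return slow if converged else fast
```

The returned value would be passed to the adaptive filter as its step size for the next frame.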
In addition, the controller may comprise a noise replacement unit. In a situation where only the far-end speaker is talking, it may be desirable to output comfort noise instead of the echo residual signal to ensure echo is completely rejected. To prevent the far-end speaker from detecting any change in signal characteristics, a comfort noise generator synthesizes noise to match the power and characteristics of the actual background noise. The noise replacement unit generates a control signal to specify the replacement of the echo residual signal by comfort noise.
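The noise replacement decision can be sketched as follows. This is a simplification: it matches only the power of the background noise by scaling white Gaussian noise, whereas the text also calls for matching its characteristics, which would need spectral shaping that is omitted here. The function names, the `noise_power` estimate, and the optional `rng` argument are assumptions.

```python
import math
import random


def comfort_noise(num_samples, noise_power, rng=None):
    """White Gaussian noise scaled to an estimated background noise power.

    Spectral shaping is omitted; only the power is matched (an assumption).
    """
    rng = rng or random.Random()
    sigma = math.sqrt(noise_power)
    return [rng.gauss(0.0, sigma) for _ in range(num_samples)]


def select_output(e_frame, far_active, near_active, noise_power, rng=None):
    """Replace the echo residual with comfort noise when only the far end talks."""
    if far_active and not near_active:
        return comfort_noise(len(e_frame), noise_power, rng)
    return list(e_frame)
```

When the near end is active, the residual frame is passed through unchanged so that near-end speech is never replaced by noise.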