1. Technical Field of the Invention
The present invention relates in general to the communications field and, in particular, to echo cancelation in communications systems.
2. Description of Related Art and Objects of the Invention
Telecommunications provides the ability for one person or group to communicate with another person or group over great distance. In many telecommunications systems, for example landline and wireless telephone systems, voice signals are often transmitted between two system users via a bi-directional communications link. In such systems, speech of a near-end user is typically detected by a near-end microphone at one end of the communications link and then transmitted over the link to a far-end loudspeaker for reproduction and presentation to a far-end user. Conversely, speech of the far-end user is detected by a far-end microphone and then transmitted via the communications link to a near-end loudspeaker for reproduction and presentation to the near-end user.
At either end of the communications link, loudspeaker output detected by a proximate microphone may be inadvertently transmitted back over the communications link, resulting in what may be unacceptably disruptive feedback, or echo, from a user perspective. Furthermore, if the round-trip loop gain is greater than unity at any audible frequency, then the system will tend to "howl," i.e., tend to produce a high-pitched whine from feedback effects, as is well known in the art.
Therefore, in order to avoid transmission of such undesirable echo signals, the microphone acoustic input should be isolated from loudspeaker output as much as possible. With a conventional telephone handset, in which the handset microphone is situated close to the user's mouth while the handset loudspeaker essentially covers the user's ear, the requisite isolation is easily achieved. However, as the physical size of portable telephones has decreased, and as handsfree speaker-phones have become more popular, manufacturers have moved toward designs in which the acoustic path from the loudspeaker to the microphone is not blocked by the user's head or body. As a result, the need for more sophisticated echo suppression techniques has become paramount in modern systems.
The need is particularly pronounced in the case of handsfree automobile telephones, where the closed vehicular environment can cause multiple reflections of a loudspeaker signal to be coupled back to a high-gain handsfree microphone. Movement of the user in the vehicle and changes in the relative directions and strengths of the echo signals, for example as windows are opened and closed or as the user moves his head while driving, further complicate the task of echo suppression in the automobile environment. Additionally, more recently developed digital telephones process speech signals through vocoders which introduce significant signal delays and create non-linear signal distortions. As is well known, these prolonged delays tend to magnify the problem of signal echo from a user perspective, and the additional nonlinear distortions can make echo suppression difficult once a speech signal has passed through a vocoder.
Considering, as a specific example, a vehicle-mounted handsfree accessory, the near-end microphone is typically about 12 inches from the near-end user's mouth. For the microphone to be sensitive enough to pick up the user's speech, it also is sensitive enough to easily pick up the sound coming from the loudspeaker and any noise inside the car. Without acoustic-echo suppression, the far-end user hears his or her own voice coming back to the near-end microphone as it bounces around inside the car after being broadcast from the loudspeaker. This unsuppressed acoustic echo is so annoying to the far-end user as to make it impossible for him or her to converse.
Thus, an ideal acoustic-echo suppressor prevents the far-end user from hearing the echo of his or her own voice while at the same time permitting natural, full-duplex conversation. However, because the automobile environment makes this goal especially difficult for an acoustic-echo suppressor to meet, prior art methods have proven less than ideal.
The automobile environment is particularly difficult for a number of additional reasons. First, double-talk situations occur frequently because people often give verbal feedback while listening. Second, the typical signal processing delays associated with digital systems require that the echo suppression be very high (e.g., 45 dB for single talk and 25 dB for double talk). Third, the reverberation inside an automobile typically takes about 50 ms to decay by 45 dB, and installations vary in the position of the microphone relative to the loudspeaker.
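The reverberation figure above translates directly into a required filter length. The following is a minimal sketch of that arithmetic; the 8 kHz sampling rate is an assumption (it is the usual narrowband telephony rate, but this section does not state one), and the function name is hypothetical.

```python
def taps_for_decay(decay_ms: float, sample_rate_hz: int = 8000) -> int:
    """Number of FIR taps needed to span an echo tail of `decay_ms` milliseconds.

    The 8000 Hz default is an assumed narrowband telephony rate,
    not a value taken from the text.
    """
    return round(decay_ms * sample_rate_hz / 1000)


# A 50 ms reverberation tail at the assumed 8 kHz rate.
print(taps_for_decay(50))
```

Under that assumption, an adaptive filter would need on the order of 400 coefficients to model a 50 ms echo tail, which gives a sense of why adaptation speed matters in this environment.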
Further reasons that acoustic echo suppression in the context of handsfree automobile telephony is especially difficult include: the signal-to-noise ratio for the mobile user's speech can be as low as 0 dB; the echo from the loudspeaker to the microphone can be louder than the mobile user's voice into the microphone; the far-end signal can be very noisy in the context of a handsfree-to-handsfree call or where the radio frequency reception between users is of poor quality; the echo path between the loudspeaker and the microphone changes constantly as the mobile user moves around, and such change is significant because the mobile user's head is typically the main obstacle or the main reflection surface between the loudspeaker and the microphone; the echo path is non-linear due to loudspeaker distortion; and the voice signal used to train the echo suppressor has periodic components within vowel sounds which create a temporary echo-path-phase ambiguity.
In addition to acoustic-type echo suppression, network-type echo suppression is also desirable in the context of mobile telephony so that, for example, a mobile user does not hear his or her own voice echoed back through a loudspeaker in the case of analog (e.g., AMPS) calls. In other words, unlike digital systems (e.g., D-AMPS and GSM), many analog systems do not cancel echoes caused by the impedance mismatch of the 4-to-2-wire hybrid typically located at the central office of a public switched telephone network (PSTN). Additionally, handsfree accessory system code can introduce an extra 4-10 msec of delay, and a digital phone can introduce an extra 4 msec of round-trip delay. Therefore, network echo is particularly perceptible with a vehicle handsfree accessory.
Network-type echo cancelation in the context of mobile telephony presents other problems as well. For example, because the network echo is different for every call, adaptive filter coefficients should not be reused, and adaptation should be extremely fast. Additionally, a network-echo suppressor should re-adapt quickly after a cellular hand-off to an analog cell, and it should be disabled after a hand-off to a digital cell. Advantageously, the teachings of the present invention may be utilized to optimize such a network-echo canceler.
In summary, echo cancelers can be used in telephony systems to reduce or eliminate annoying echo effects. For example, in cellular Public Land Mobile Networks (PLMNs), echo cancelers are used in mobile services switching centers (MSCs) to suppress or remove echoes in speech traffic. Echo cancelers are also used in mobile radiotelephones and handsfree telephone equipment to compensate for acoustical echoes. Finally, echo cancelers are employed within the PSTN to reduce or eliminate echoes arising from impedance mismatches.
Referring now to FIG. 1, a simplified schematic block diagram of a conventional echo canceler 100 is illustrated. An echo path is denoted by 110 and represents speech signal(s) being reflected back to the far-end user (not pictured). The main component of such a conventional echo canceler 100 is an adaptive finite-impulse-response (FIR) filter 120. Under the control of an adaptation algorithm (e.g., in software), the filter 120 models the impulse response of the echo path.
A non-linear processor (NLP) 130 is used to remove residual echo that may remain after linear processing of the input signal. The block "H" denoted by 140 represents the echo source in the telephony system which passes the "desired" signal from a near-end user (not pictured). A signal combiner 150 is used to subtract out the unwanted echo component, as estimated by the filter 120, from the "desired" signal. A feedback signal 160 provides control feedback from the output of the signal combiner 150 to an input of the filter 120. Ideally, the resulting signal after the signal combiner 150 (and especially after the NLP 130) has no echo component.
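The signal flow of FIG. 1 can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the implementation of any particular product: the adaptation rule is the textbook LMS update, the three-tap echo path and the white-noise far-end signal are hypothetical test data, there is no near-end speech or noise, and the NLP 130 is omitted.

```python
import numpy as np


def cancel_echo_lms(far_end, mic, num_taps, mu):
    """Run a textbook LMS echo canceler; return (error signal, final taps)."""
    w = np.zeros(num_taps)        # coefficients of the adaptive FIR filter 120
    x_buf = np.zeros(num_taps)    # recent far-end samples, newest first
    err = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_est = w @ x_buf          # echo estimate from filter 120
        err[n] = mic[n] - echo_est    # signal combiner 150
        w += mu * err[n] * x_buf      # feedback signal 160 drives adaptation
    return err, w


rng = np.random.default_rng(0)
far_end = rng.standard_normal(5000)          # hypothetical far-end signal
true_path = np.array([0.5, -0.3, 0.1])       # hypothetical echo path (block H)
mic = np.convolve(far_end, true_path)[:len(far_end)]  # echo only

err, w = cancel_echo_lms(far_end, mic, num_taps=3, mu=0.01)
print(w)
```

In this idealized, noise-free setting the coefficients settle on the hypothetical echo path and the residual after the combiner falls toward zero. Real acoustic paths are far longer, time-varying, and noisy.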
Unfortunately, the ability of echo cancelers to cancel the echo component from the "desired" signal is heavily dependent on the quality of the algorithm used in the filter 120. One algorithm used in existing systems is the Least Mean Square (LMS) algorithm; another is the Normalized LMS (NLMS) algorithm. These algorithms are used to adapt the filtering process that occurs within the filter 120, but prior art implementations of these algorithms have been deficient in several areas.
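For reference, the two algorithms are commonly written as follows in the adaptive-filtering literature. The notation is an assumption introduced here and does not appear in this section: \mathbf{w}(n) is the coefficient vector of the filter 120, \mathbf{x}(n) is the vector of recent reference (far-end) samples, d(n) is the signal entering the signal combiner 150, e(n) is its output, \mu is the fixed update gain, and \epsilon is a small constant that guards against division by zero.

```latex
\begin{align}
  e(n) &= d(n) - \mathbf{w}^{T}(n)\,\mathbf{x}(n) \\
  \text{LMS:}\quad
  \mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\, e(n)\, \mathbf{x}(n) \\
  \text{NLMS:}\quad
  \mathbf{w}(n+1) &= \mathbf{w}(n)
      + \frac{\mu}{\mathbf{x}^{T}(n)\,\mathbf{x}(n) + \epsilon}\; e(n)\, \mathbf{x}(n)
\end{align}
```

The only difference between the two is the normalization by the reference energy \mathbf{x}^{T}(n)\,\mathbf{x}(n).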
For example, both the LMS and the NLMS algorithm require that an update gain (the gain with which the coefficients of the filter 120 are updated) be selected and fixed for a given installation. Selecting this update gain demands various tradeoffs in performance. If the fixed update gain is set so that the algorithm is stable when the gain of the echo channel is very low, then that setting of the fixed update gain causes slow adaptation when the gain of the echo channel is high. On the other hand, if the fixed update gain is set so that the filter adapts quickly when the gain of the echo channel is high, then that setting causes instability in the system when the gain of the echo channel is very low.
As an additional example, the conventional NLMS algorithm produces high update gain for small reference signals even though the resulting echo may be overwhelmed by noise at the microphone. Consequently, the algorithm either is unstable or must be slowed down at all times (by reducing the fixed update gain) to handle this possibility. In either situation, the prior art algorithm is sub-optimal.
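The effect described above follows from the normalization in the textbook NLMS update, where the per-sample gain is the fixed update gain divided by the energy of the reference samples in the filter. A minimal numeric sketch, assuming a hypothetical 256-tap filter and constant-amplitude reference buffers chosen only for illustration:

```python
import numpy as np


def nlms_effective_gain(x_buf, mu=0.5, eps=1e-12):
    """Per-sample gain mu / (x.x + eps) applied by textbook NLMS."""
    return mu / (np.dot(x_buf, x_buf) + eps)


num_taps = 256                          # hypothetical filter length
loud_ref = np.full(num_taps, 0.1)       # reference at amplitude 0.1
quiet_ref = np.full(num_taps, 0.001)    # reference 40 dB quieter

gain_loud = nlms_effective_gain(loud_ref)
gain_quiet = nlms_effective_gain(quiet_ref)
print(gain_quiet / gain_loud)           # ratio of the two effective gains
```

A reference that is 40 dB quieter receives an effective gain roughly 10,000 times larger. If the error signal at that moment is dominated by microphone noise rather than echo, the large gain drives the noise into the coefficients.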
In summary, while existing systems have heretofore used the LMS and NLMS algorithms in the modeling of echo signals, such existing systems have done so only non-optimally. The present invention optimizes the algorithms by achieving the following (and other) objects of the invention:
An object of the invention is to provide an adaptation algorithm that has an overall update gain that is proportional to the gain of the echo channel.
Another object of the invention is to provide an adaptation algorithm that can incorporate a higher nominal update gain by specifically accounting for situations with, for example, small reference signals at a loudspeaker and high noise at a corresponding microphone.
The present invention fulfills the above-described and other needs by providing optimizations for use in echo cancelers. Echo canceling devices constructed in accordance with the teachings of the present invention include an adaptive finite impulse response (FIR) filter for estimating a transfer function of an echo channel in a communications link. Optimized versions of the Least Mean Square (LMS) and Normalized LMS (NLMS) algorithms are used to adapt the filter coefficients of the estimated transfer functions.
In a first embodiment, the echo channel energy gain is included in the LMS or NLMS update equation to increase the speed at which the coefficients of the transfer function are updated. This enables adaptation speed to be proportional to the channel energy gain. The teachings of the present invention provide an algorithm for estimating the echo channel energy gain and for adapting the estimate based on measured system parameters, such as a measured instantaneous channel gain and a near-end voice level.
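One way to picture the first embodiment is an NLMS step whose gain is multiplied by an estimate of the echo channel energy gain. The sketch below is an illustrative assumption only: this section does not give the update equation, so the multiplicative form, the function name, and the parameter names are hypothetical, and the estimator for the channel energy gain is left out because it is not specified here.

```python
import numpy as np


def nlms_step_channel_scaled(w, x_buf, mic_sample, mu,
                             channel_energy_gain, eps=1e-8):
    """One NLMS step with the gain scaled by an echo-channel energy-gain estimate.

    Hypothetical form: the multiplicative scaling is an assumed reading of
    "adaptation speed proportional to the channel energy gain".
    """
    err = mic_sample - w @ x_buf
    gain = mu * channel_energy_gain / (x_buf @ x_buf + eps)
    return w + gain * err * x_buf, err


w0 = np.zeros(4)
x_buf = np.array([1.0, 0.5, -0.5, 0.25])    # hypothetical reference samples

# Same input and error, two different channel energy-gain estimates.
w_low, _ = nlms_step_channel_scaled(w0, x_buf, 0.2, mu=0.1,
                                    channel_energy_gain=0.25)
w_high, _ = nlms_step_channel_scaled(w0, x_buf, 0.2, mu=0.1,
                                     channel_energy_gain=1.0)
print(w_high / w_low)
```

With this form, quadrupling the estimated channel energy gain quadruples the size of the coefficient step, so the filter moves faster exactly when the echo is strong.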
In a second embodiment, the average energy of either the microphone signal (in an acoustic echo canceler implementation, for example) or the error signal, as well as the standard reference signal, are included in the NLMS update equation. As a result, when noise into a microphone is high and the standard reference signal is small, the overall update gain is lower than that of the standard NLMS. This embodiment permits the use of a higher nominal fixed update gain; consequently, the algorithm converges more quickly.
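The behavior described for the second embodiment can be sketched by adding the average microphone energy to the NLMS normalization term. This is an assumed form for illustration: this section does not state how the energies are combined, so the additive denominator, the scaling by filter length, and all names and values below are hypothetical.

```python
import numpy as np


def standard_nlms_gain(x_buf, mu, eps=1e-8):
    """Textbook NLMS gain: normalized by reference energy only."""
    return mu / (x_buf @ x_buf + eps)


def energy_aware_nlms_gain(x_buf, mu, avg_mic_energy, eps=1e-8):
    """Hypothetical gain whose denominator also carries the average
    microphone energy, scaled by the filter length (assumed form)."""
    return mu / (x_buf @ x_buf + len(x_buf) * avg_mic_energy + eps)


num_taps = 128
noise_energy = 0.01                       # assumed average microphone energy
quiet_ref = np.full(num_taps, 0.001)      # small reference signal
loud_ref = np.full(num_taps, 0.5)         # large reference signal

for name, ref in (("quiet", quiet_ref), ("loud", loud_ref)):
    std = standard_nlms_gain(ref, mu=0.5)
    aware = energy_aware_nlms_gain(ref, mu=0.5, avg_mic_energy=noise_energy)
    print(name, aware / std)
```

Under these assumptions the gain for the small reference is cut by several orders of magnitude, while the gain for the large reference is reduced only slightly, which is what allows a higher nominal fixed update gain.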
In a third embodiment, both (i) the echo channel energy gain and (ii) the average energy of either the microphone signal or the error signal, as well as the standard reference signal, are included in the update equation. This third embodiment, therefore, enables both a proportional adaptation speed and a higher nominal update gain.