The present invention relates to communications systems, and more particularly, to echo suppression in bi-directional communications systems.
Adaptive filtering arrangements are prevalent in communications systems of today. Such arrangements are typically used to reduce or remove unwanted signal components and/or to control or enhance signal components of interest.
A common example of such a filtering arrangement relates to hands-free telephony, wherein the built-in earphone and microphone of a conventional telephone handset are replaced with an external loudspeaker and an external microphone, respectively, so that the telephone user can converse without having to physically hold the telephone unit in hand. Since sound emanating from the external loudspeaker can be picked up by the external microphone, adaptive filtering is commonly performed in order to prevent the loudspeaker output from echoing back and annoying the far-end user at the other end of the conversation. This type of adaptive filtering, or echo canceling, has become a basic feature of the full-duplex, hands-free communications devices of today.
Typically, echo cancelation is achieved by passing the loudspeaker signal through an adaptive Finite Impulse Response (FIR) filter which approximates, or models, the acoustic echo path between the hands-free loudspeaker and the hands-free microphone (e.g., a passenger cabin in an automobile hands-free telephony application). The FIR filter thus provides an echo estimate which can be removed from the microphone output signal prior to transmission to the far-end user. The filtering characteristic (i.e., the set of FIR coefficients) of the adaptive FIR filter is dynamically and continuously adjusted, based on both the loudspeaker input and the echo-canceled microphone output, to provide a close approximation to the echo path and to track changes in the echo path (e.g., when a near-end user of an automobile hands-free telephone shifts position within the passenger cabin).
Adjustment of the filtering characteristic is commonly achieved using a form of the well-known Least Mean Square (LMS) adaptation algorithm developed by Widrow and Hoff in 1960. The LMS algorithm is a stochastic gradient method for minimizing mean square error which, as it is both efficient and robust, is widely used in real-time applications. The LMS algorithm and its well-known variations (e.g., the Normalized LMS, or NLMS, algorithm) do have certain drawbacks, however. For example, the LMS and other known algorithms can sometimes be slow to converge (i.e., to approach the target filtering characteristic, such as the acoustic echo path in a hands-free telephony application), particularly when the algorithm is adapted, or trained, based on a non-white, or colored, input signal.
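By way of illustration only, the adaptive FIR echo canceler and a normalized LMS coefficient update of the kind described above might be sketched as follows. The function names, the tap count, the step size, and the four-tap echo path used in the demonstration are all hypothetical choices made for this sketch and are not taken from the described system.

```python
import random


def nlms_echo_cancel(loudspeaker, microphone, num_taps=8, step=0.5, eps=1e-8):
    """Run an NLMS adaptive FIR filter; return (residual, coefficients)."""
    weights = [0.0] * num_taps   # FIR coefficients modeling the echo path
    history = [0.0] * num_taps   # recent loudspeaker samples, newest first
    residual = []
    for x, d in zip(loudspeaker, microphone):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate            # echo-canceled microphone sample
        energy = sum(h * h for h in history) + eps
        gain = step * error / energy         # normalization by input energy
        weights = [w + gain * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


def simulate_echo(loudspeaker, echo_path):
    """Convolve the loudspeaker signal with a fixed (assumed) echo path."""
    history = [0.0] * len(echo_path)
    echo = []
    for x in loudspeaker:
        history = [x] + history[:-1]
        echo.append(sum(c * h for c, h in zip(echo_path, history)))
    return echo


def demo(seed=0):
    """Train on white noise against a hypothetical four-tap echo path."""
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.2, 0.1]
    loudspeaker = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    microphone = simulate_echo(loudspeaker, echo_path)
    residual, weights = nlms_echo_cancel(loudspeaker, microphone, num_taps=4)
    return echo_path, weights, residual
```

The demonstration uses a white training signal and no near-end noise, which is the favorable case; with a colored loudspeaker signal the same update would be expected to converge more slowly, as noted above.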
As a result, echo cancelers utilizing the LMS or other adaptive algorithms can temporarily allow significant residual echo to pass back to a far-end user whenever the true echo path is changing or unknown (e.g., upon first installation of a hands-free device). Moreover, known adaptive algorithms tend to diverge during periods in which the far-end user is not speaking (i.e., when the energy content of the loudspeaker signal is insufficient to provide a basis for developing a quality echo estimate). Consequently, significant residual echo can also be temporarily passed back to the far-end user each time the far-end user begins to speak after having been silent for a period of time.
To reduce residual echo immediately following first installation of a device in a new and acoustically unknown environment, conventional systems often employ an initialization sequence to train the echo canceler before the device is used for actual communications. Specifically, an artificial audio signal (typically white noise) is played through the loudspeaker, and the echo canceler is given time to converge to the new echo path prior to a first call being made or received. However, such an approach does not address the above-described problems associated with slow filter recovery time following changes in the echo path or following periods of far-end user silence. Thus, there is a need for improved methods and apparatus for providing echo cancelation in communications systems.
The present invention fulfills the above-described and other needs by providing echo canceling techniques wherein a noise signal is selectively added to a downlink (e.g., loudspeaker) signal to thereby improve adaptive filter convergence speed. Advantageously, the added signal content enables an echo-canceling adaptive filter to more quickly track echo path changes during user communications. Moreover, the added noise prevents divergence of the adaptive filter during periods in which the downlink signal does not contain information (e.g., far-end speech) sufficient for developing a good estimate of the echo path.
To prevent performance degradation from a user perspective, the added noise can be made to resemble existing system noise (e.g., either near-end or far-end background noise). For example, the power spectrum and level of existing near-end noise can be estimated in real-time, and the added noise can be generated having a similar or identical spectrum and a somewhat lower power level. Thus, a system constructed according to the invention can provide the benefits of enhanced filter convergence speed without creating user-perceptible differences in overall system operation.
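As a rough sketch of how noise resembling existing background noise might be generated at a somewhat lower level, the following example stands in a first-order autoregressive (AR(1)) model for a full power-spectrum estimate. That model, the function names, and the -6 dB level offset are assumptions made purely for illustration; the text above does not specify how the spectrum is estimated or how much lower the added noise level should be.

```python
import math
import random


def estimate_ar1(noise):
    """Estimate (power, lag-1 correlation coefficient) of a noise segment."""
    power = sum(s * s for s in noise) / len(noise)
    lag1 = sum(a * b for a, b in zip(noise[1:], noise[:-1])) / (len(noise) - 1)
    coeff = lag1 / power if power > 0.0 else 0.0
    coeff = max(-0.99, min(0.99, coeff))   # keep the synthesis filter stable
    return power, coeff


def generate_matched_noise(power, coeff, count, level_db=-6.0, rng=None):
    """Synthesize AR(1) noise with the given coloring, offset by level_db."""
    rng = rng or random.Random()
    target_power = power * 10.0 ** (level_db / 10.0)
    # Driving-noise deviation that yields the target output power.
    drive_std = math.sqrt(target_power * (1.0 - coeff * coeff))
    sample = rng.gauss(0.0, math.sqrt(target_power))  # stationary start
    out = []
    for _ in range(count):
        sample = coeff * sample + rng.gauss(0.0, drive_std)
        out.append(sample)
    return out
```

A practical system would more likely track the noise spectrum with a higher-order model or a frequency-domain estimate; the single coefficient here is only meant to show the "same shape, lower level" idea.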
Advantageously, filter adaptation speed can be further enhanced according to the invention by modifying characteristics of the added noise at appropriate times. For example, the spectrum of the added noise can be spread, and the power level of the added noise can be increased, whenever the near-end user of a hands-free telephone is speaking (and therefore effectively masking the added noise). Whitening and strengthening the added noise signal enables the adaptive echo canceler to better identify the true echo path, and doing so only when the near-end user is speaking results in no performance degradation (i.e., no user-perceptible changes in overall system operation).
An exemplary communications device according to the invention includes an adaptive echo canceler receiving a near-end audio signal and providing an echo-canceled near-end signal for transmission to a far-end user via a communications channel, adaptive filtering coefficients of the adaptive echo canceler being dynamically adjusted in dependence upon the echo-canceled near-end signal and upon a reference signal. The exemplary communications device further includes a noise estimation processor receiving a far-end audio signal via the communications channel and providing the reference signal to the adaptive echo canceler, the noise estimation processor producing the reference signal by selectively adding noise to the far-end audio signal. For example, the noise estimation processor can include a voice activity detector for determining whether the far-end audio signal includes speech of the far-end user, and can thus add noise to the far-end audio signal only during periods in which the far-end audio signal does not include speech of the far-end user.
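The selective addition of noise described above might, for example, be sketched on a frame-by-frame basis as follows. The simple energy-threshold detector is a hypothetical stand-in for the voice activity detector, and the function names, frame layout, and threshold are assumptions made for this illustration only.

```python
def frame_energy(frame):
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in frame) / len(frame)


def far_end_speech_active(frame, threshold):
    """Hypothetical energy-based voice activity decision for one frame."""
    return frame_energy(frame) > threshold


def build_reference(far_end_frames, noise_frames, threshold):
    """Add noise only to far-end frames judged to contain no speech."""
    reference = []
    for far, noise in zip(far_end_frames, noise_frames):
        if far_end_speech_active(far, threshold):
            reference.append(list(far))          # speech present: pass through
        else:
            reference.append([f + n for f, n in zip(far, noise)])
    return reference
```

The resulting frames would serve both as the loudspeaker signal and as the reference signal supplied to the adaptive echo canceler, so that the filter always has some excitation to adapt on.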
To prevent degradation of system performance from the user perspective, the noise added to the far-end audio signal can be an estimate of noise present in one of the near-end and far-end environments of the device. Additionally, a level of the noise added to the far-end audio signal can be made slightly less than an estimated level of the noise present in the near-end environment.
To further improve adaptive echo canceler performance, the noise estimation processor can include a voice activity detector for determining whether the near-end audio signal includes speech of the near-end user. Thus, the noise estimation processor can modify the noise added to the far-end audio signal when the near-end audio signal includes speech of the near-end user. For example, the noise added to the far-end audio signal can be whitened when the near-end audio signal includes speech of the near-end user. Additionally, a power level of the noise added to the far-end audio signal can be increased when the near-end audio signal includes speech of the near-end user.
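One way to picture this modification is as a switch between two sets of noise parameters, chosen according to the near-end voice activity decision. In the sketch below, the added noise is characterized by a power level and a single coloring coefficient (zero meaning white); this parameterization, the function name, and both decibel offsets are hypothetical and are labeled as such in the code.

```python
def select_noise_parameters(near_end_speech, background_power, background_coeff,
                            idle_offset_db=-6.0, masked_offset_db=0.0):
    """Return (power, coloring coefficient) for the noise to be added.

    Both offsets are assumed values relative to the estimated background
    noise power; they are not specified by the described system.
    """
    if near_end_speech:
        # Near-end speech masks the added noise: whiten it and raise its level.
        offset_db = masked_offset_db
        coeff = 0.0
    else:
        # No masking: keep the background coloring and a lower level.
        offset_db = idle_offset_db
        coeff = background_coeff
    power = background_power * 10.0 ** (offset_db / 10.0)
    return power, coeff
```

For instance, with an estimated background power of 1.0 and a coloring coefficient of 0.8, the function selects colored noise about 6 dB below the background while the near-end user is silent, and white noise at the background level while the near-end user is speaking.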
An exemplary method of echo suppression according to the invention includes the steps of filtering a near-end audio signal to provide an echo-canceled near-end signal for transmission to a far-end user via a communications channel, dynamically adjusting filtering coefficients used in the filtering step in dependence upon the echo-canceled near-end signal and upon a reference signal, and selectively adding noise to a far-end audio signal to provide the reference signal used in the adjusting step. The step of selectively adding noise can, for example, include the steps of determining whether the far-end audio signal includes speech of the far-end user, and adding noise to the far-end audio signal only during periods in which the far-end audio signal does not include far-end user speech.
Additionally, the step of adding noise to the far-end audio signal can include the steps of estimating noise present in one of a near-end and a far-end environment, and adding noise which is audibly similar to the estimated noise to the far-end audio signal. A level of the noise added to the far-end audio signal can, for example, be made slightly less than an estimated level of the noise present in the near-end environment.
The exemplary method can further include the steps of determining whether the near-end audio signal includes speech of the near-end user, and modifying the noise added to the far-end audio signal when the near-end audio signal includes speech of the near-end user. For example, the added noise can be whitened when the near-end audio signal includes speech of the near-end user. Additionally, a power level of the added noise can be increased when the near-end audio signal includes speech of the near-end user.
The above-described and other features and advantages of the invention are explained in detail hereinafter with reference to the illustrative examples shown in the accompanying drawings. Those of skill in the art will appreciate that the described embodiments are provided for purposes of illustration and understanding and that numerous equivalent embodiments are contemplated herein.