The present invention relates to communications systems, and more particularly, to echo suppression in a bi-directional communications link.
In many communications systems, for example landline and wireless telephone systems, voice signals are often transmitted between two system users via a bi-directional communications link. In such systems, speech of a near-end user is typically detected by a near-end microphone at one end of the communications link and then transmitted over the link to a far-end loudspeaker for reproduction and presentation to a far-end user. Conversely, speech of the far-end user is detected by a far-end microphone and then transmitted via the communications link to a near-end loudspeaker for reproduction and presentation to the near-end user. At either end of the communications link, loudspeaker output detected by a proximate microphone may be inadvertently transmitted back over the communications link, resulting in what may be unacceptably disruptive feedback, or echo, from a user perspective.
Therefore, in order to avoid transmission of such undesirable echo signals, the microphone acoustic input should be isolated from loudspeaker output as much as possible. With a conventional telephone handset, in which the handset microphone is situated close to the user's mouth while the handset speaker essentially covers the user's ear, the requisite isolation is easily achieved. However, as the physical size of portable telephones has decreased, and as hands-free speaker-phones have become more popular, manufacturers have moved toward designs in which the acoustic path from the loudspeaker to the microphone is not blocked by the user's head or body. As a result, the need for more sophisticated echo suppression techniques has become paramount in modern systems.
The need is particularly pronounced in the case of hands-free automobile telephones, where the closed vehicular environment can cause multiple reflections of a loudspeaker signal to be coupled back to a high-gain hands-free microphone. Movement of the user in the vehicle and changes in the relative directions and strengths of the echo signals, for example as windows are opened and closed or as the user moves his head while driving, further complicate the task of echo suppression in the automobile environment. Additionally, more recently developed digital telephones process speech signals through voice encoders which introduce significant signal delays and create non-linear signal distortions. Such prolonged delays tend to magnify the problem of signal echo from a user perspective, and the additional non-linear distortions make echo suppression by the network equipment more difficult.
In response to the above described challenges, telephone manufacturers have developed a wide variety of echo suppression mechanisms. An exemplary echo suppression system 100 is depicted in FIG. 1A. As shown, the exemplary system 100 includes a microphone 110, a loudspeaker 120 and an echo suppressor 130. An audio output 115 of the microphone 110 is coupled to an audio input of the echo suppressor 130, and an audio output 135 of the echo suppressor 130 serves as a near-end audio input to a telephone (not shown). Additionally, a far-end audio output 125 from the telephone is coupled to an audio input of the loudspeaker 120 and to a reference input of the echo suppressor 130.
In operation, the echo suppressor 130 processes the microphone signal 115 to provide the audio output signal 135 to a far-end telephone user. More specifically, the echo suppressor 130 attenuates the microphone signal 115, in dependence upon the far-end audio signal 125, so that acoustic echo from the loudspeaker 120 to the microphone 110 is not passed back to the far-end telephone user.
Typically, the echo suppressor 130 is either a non-linear, clipping type suppressor or a linear, scaling type suppressor. Clipping type suppressors generally attenuate the microphone output signal 115 by removing a portion of the signal falling within a particular range of values (i.e., within a particular clipping window). Scaling type suppressors, on the other hand, attenuate the microphone output signal 115 by multiplying the signal by an appropriate scale factor. Recently developed hybrid suppressors incorporate both clipping and scaling aspects, for example by scaling a portion of the microphone signal falling within a particular attenuation window. In any case, the level of attenuation (i.e., the clipping window and/or the scale factor) is generally adjusted, either directly or indirectly, in accordance with the amplitude of the far-end audio signal 125 so that the microphone output 115 is attenuated only to the extent the far-end user is speaking.
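By way of illustration only, the three suppressor types described above can be sketched per sample as follows. The function names, the choice to zero samples inside the clipping window, and the particular hybrid rule (scaling only the in-window portion and passing the remainder at full gain) are assumptions made for this sketch and are not taken from the cited patents or applications.

```python
def center_clip(sample, window):
    """Clipping type (assumed form): remove any sample lying inside
    the clipping window [-window, +window]; pass larger samples unchanged."""
    return 0.0 if abs(sample) <= window else sample


def scale(sample, factor):
    """Scaling type: multiply the sample by a scale factor (0 <= factor <= 1)."""
    return factor * sample


def hybrid(sample, window, factor):
    """Hybrid type (assumed form): scale only the portion of the sample
    that falls within the attenuation window; the excess passes unscaled."""
    magnitude = abs(sample)
    if magnitude <= window:
        return factor * sample
    sign = 1.0 if sample > 0 else -1.0
    return sign * (factor * window + (magnitude - window))


def window_from_far_end(far_end_level, coupling_gain):
    """Hypothetical control rule: widen the window as the far-end signal
    grows, so attenuation applies only while the far-end user is speaking."""
    return coupling_gain * abs(far_end_level)
```

With a far-end level of zero the hypothetical control rule yields a zero-width window, so near-end speech passes essentially unattenuated when the far-end user is silent.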
A conventional clipping type suppressor, known in the art as a center clipper, is described for example in U.S. Pat. No. 5,475,731, entitled "Echo-Canceling System and Method Using Echo Estimate to Modify Error Signal" and issued Dec. 12, 1995 to Rasmusson et al. An alternative clipping type suppressor, known as an AC-Center clipper, is described in copending U.S. patent application Ser. No. 08/775,797, entitled "An AC-Center Clipper for Noise and Echo Suppression in a Communications System" and filed Dec. 31, 1996. An exemplary scaling type suppressor is described in U.S. Pat. No. 5,283,784, entitled "Echo Canceller Processing Techniques and Processing" and issued Feb. 1, 1994 to Genter. An advanced hybrid suppressor, referred to herein as an AC-center attenuator, is described in copending U.S. patent application Ser. No. 09/005,149, entitled "Methods and Apparatus for Improved Echo Suppression in Communications Systems" and filed on even date herewith. Advanced control of these and other clipping, scaling and hybrid type suppressors is described in copending U.S. patent application Ser. No. 09/005,144, entitled "Methods and Apparatus for Controlling Echo Suppression in Communications Systems" and filed on even date herewith. Each of the above identified patents, as well as each of the above identified copending patent applications, is incorporated herein in its entirety by reference.
The echo suppressor 130 of FIG. 1A can also be combined with a linear echo canceler to provide a more sophisticated echo suppression system. FIG. 1B depicts an exemplary system 101 including the microphone 110, the loudspeaker 120 and the echo suppressor 130 of FIG. 1A, and an acoustic echo canceler 140. As shown, the audio output 115 of the microphone 110 is coupled to an audio input of the acoustic echo canceler 140, and control and audio outputs 144, 145 of the acoustic echo canceler 140 are coupled to control and audio inputs of the echo suppressor 130, respectively. The audio output 135 of the echo suppressor 130 serves as the near-end audio input to the telephone (not shown), and the far-end audio output 125 from the telephone is coupled to the audio input of the loudspeaker 120 and to reference inputs of the acoustic echo canceler 140 and the echo suppressor 130.
In operation, the acoustic echo canceler 140 dynamically models the acoustic path from the loudspeaker 120 to the microphone 110 and attempts to cancel, from the microphone output signal 115, any loudspeaker sound that is picked up by the microphone 110. Algorithms commonly used for modeling the acoustic echo path include the well known Least Mean Squares (LMS) algorithm and variants such as Normalized Least Mean Squares (NLMS). An exemplary Least Mean Squares based canceler is described in the above cited U.S. Pat. No. 5,475,731 to Rasmusson et al. Additionally, an advanced Normalized Least Mean Squares based canceler is described in copending U.S. patent application Ser. No. 08/852,729, entitled "An Improved Echo Canceler for use in Communications Systems" and filed May 7, 1997, which is incorporated herein in its entirety by reference.
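By way of illustration only, a textbook Normalized Least Mean Squares update of the kind referred to above can be sketched as follows. This is a generic form of the algorithm, not the canceler of the cited patent or application; the function names, the tap count, the step size mu and the regularization constant eps are assumptions chosen for the sketch, and simulate_echo is a hypothetical helper that merely produces test signals.

```python
import random


def nlms_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Adaptively model the loudspeaker-to-microphone path and subtract
    the estimated echo from the microphone signal (generic NLMS)."""
    weights = [0.0] * taps      # current estimate of the echo path
    history = [0.0] * taps      # most recent far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate            # canceler audio output
        energy = sum(h * h for h in history) + eps
        step = mu * error / energy           # normalized step
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


def simulate_echo(far_end, path):
    """Hypothetical helper: microphone signal containing only echo,
    formed by convolving the far-end signal with a fixed path."""
    return [
        sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far_end))
    ]
```

In this simplified setting, with no near-end speech or noise present, the weights converge toward the simulated path and the residual decays toward zero. A practical canceler must additionally cope with near-end speech and a changing path, which the sketch does not address.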
The control output, or control metric 144, indicates the instantaneous level of cancelation achieved by the acoustic echo canceler 140 and is used by the echo suppressor 130 to determine the level of additional attenuation needed to suppress any residual echo component to a particular goal level. As in the system 100 of FIG. 1A, the echo suppressor 130 can be a clipping suppressor, a scaling suppressor or a hybrid suppressor. The control metric 144 is thus adjusted accordingly as described for example in the above cited patents and patent applications. Additionally, the echo suppressor 130 can, when following the echo canceler 140, be a simple switch which selectively mutes the audio output 135 at appropriate times (e.g., during periods in which a near-end voice activity detector indicates that the microphone signal 115 contains no near-end speech).
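By way of illustration only, one simple way of deriving the additional attenuation from a control metric is sketched below. The text above does not specify the units or form of the control metric 144; expressing both the achieved cancelation and the goal level in decibels, and the function names used, are assumptions made solely for this sketch.

```python
def additional_attenuation_db(achieved_db, goal_db):
    """Attenuation (dB) the suppressor must still supply so that the
    total echo attenuation reaches the goal; never negative."""
    return max(0.0, goal_db - achieved_db)


def scale_factor_from_db(attenuation_db):
    """Convert an attenuation in dB to a linear amplitude scale factor,
    as could be applied by a scaling type suppressor."""
    return 10.0 ** (-attenuation_db / 20.0)
```

For instance, under these assumptions a canceler achieving 25 dB against a 45 dB goal leaves 20 dB for the suppressor, whereas a canceler already exceeding the goal requires no further attenuation.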
Note that in both of the exemplary systems 100, 101 of FIGS. 1A and 1B, the echo suppressor 130 attenuates the entire audio signal. Thus, in addition to attenuating the echo, the echo suppressor 130 also attenuates any background noise and/or near-end speech which may be present. In fact, the background noise can be suppressed to the point that the far-end user may erroneously believe that the call has been disconnected when the echo suppressor 130 is active. Therefore, to improve the quality of communication for the far-end user, today's systems often add comfort noise to the telephone audio signal 135 when the echo suppressor 130 is active.
For example, some systems replace muted audio signals with white noise produced by a pseudo-random number generator (PRNG), wherein a variance of the noise samples is set based on an estimate of the energy in the actual background noise. Additionally, the above cited U.S. Pat. No. 5,283,784 to Genter describes a similar approach in which white noise samples are band-limited to the telephone system bandwidth and stored in a read-only memory (ROM) table. Comfort noise is then generated as needed by selecting samples from the table. Yet another solution is described in U.S. patent application Ser. No. 08/375,144, entitled "Method of and Apparatus for Echo Reduction in a Hands-Free Cellular Radio Communication System" and filed Jan. 19, 1995, which is incorporated herein in its entirety by reference. There, a block of samples of actual background noise is stored in memory, and comfort noise is generated by outputting segments of successively stored samples beginning with random starting points within the block.
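By way of illustration only, two of the conventional approaches described above can be sketched as follows: white noise whose variance is set from a noise energy estimate, and replay of stored background noise from random starting points. The function names, the Gaussian distribution, the fixed segment length and the representation of the noise energy estimate as a mean-square value are assumptions made for this sketch and are not drawn from the cited references.

```python
import random


def white_comfort_noise(noise_energy, count, rng):
    """Generate white noise whose variance equals an estimate of the
    mean-square value of the actual background noise."""
    sigma = noise_energy ** 0.5
    return [rng.gauss(0.0, sigma) for _ in range(count)]


def replayed_comfort_noise(stored_block, count, segment_len, rng):
    """Generate comfort noise by outputting segments of successively
    stored samples, each beginning at a random point within the block."""
    out = []
    last_start = len(stored_block) - segment_len
    while len(out) < count:
        start = rng.randrange(0, last_start + 1)
        out.extend(stored_block[start:start + segment_len])
    return out[:count]
```

Both generators take a random.Random instance as rng so that the sketch is repeatable. Neither shapes the spectrum of its output: the first is spectrally flat, and the second simply reuses the stored samples.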
While the above described systems provide certain advantages, none provides comfort noise which closely and consistently matches the actual environment noise in terms of both spectral content and magnitude. For example, the spectral content of comfort noise produced by generating white noise samples is, by definition, uniform across the audible frequency band, while automobile background noise is typically biased toward the low end of the band. Also, since the degree of spectral tilt varies from car to car and depends on prevailing driving conditions, storing an exemplary tilted spectrum in ROM is insufficient. Further, comfort noise generated by repeatedly outputting segments of actual noise samples includes a significant periodic component and therefore often sounds as if it includes a distorted added tone.
Thus, with conventional noise generation techniques, the far-end user perceives continual changes in the character and content of the transmitted background noise, as comfort noise is selectively added or substituted only when the echo suppressor 130 is active. Such changes in the perceived background noise can be annoying or even intolerable. For example, with the relatively long delay in today's digital cellular phones, differences between actual background noise and modeled comfort noise are often perceived as whisper echoes. Consequently, there is a need for improved methods and apparatus for generating comfort noise in echo suppression systems.