This application is a Continuation-In-Part of Applicant's co-pending U.S. application Ser. No. 09/005,144, entitled "Methods and Apparatus for Controlling Echo Suppression in Communications Systems," filed on Jan. 9, 1998, and incorporated herein by reference.
The present invention relates to echo suppression in bi-directional communications systems and, in particular, relates to minimizing clipping and distortion of desired-voice signals while providing satisfactory echo suppression.
In bi-directional communications, two or more systems transfer information to one another. Thus, each system both receives and transmits information. In the context of voice conversation, remote parties engage in two-way conversation, with each party both sending and receiving signals representative of speech. Ideally, signals received by a party on the near-end contain only the speech (and noise) originated by the far-end party. Echo phenomena represent deviations from this ideal case and, with regard to voice transmissions, refer to one party undesirably receiving a delayed version of his or her own voice during the conversation.
In a more generalized sense, echo refers to a transmitted signal undesirably being reflected back, typically with a variable delay, in a received signal. In the context of voice communications, echo refers to one party receiving a delayed version of his or her own voice. Such echo may arise from network echo and/or acoustic echo. Impedance mismatches within the transmission network carrying the two-way signals can result in network echo. As an example, impedance mismatches in 4-to-2 wire hybrid interfaces in central offices of the Public Switched Telephone Network may cause network echo. Acoustic echo arises from acoustic coupling between an output loudspeaker and an associated input microphone. Speakerphones, in-car phone systems, and other "hands-free" communication systems are primary examples of instances where acoustic echo may be problematic. In either instance, transmission delay within the communications system exacerbates the problem of echo. With increasing delay, the echo signal becomes increasingly displaced in time from the actual signal and, therefore, more noticeable. Satellite-based telephony is an example of a communications system having significant intrinsic delay, with round-trip signal delays of approximately 520 milliseconds. Other types of communications systems have less obvious sources of delay. For example, wireless mobile communications systems employing digital encoding techniques, such as Global System for Mobile Communications (GSM) or TIA/EIA-136, have signal encoding delays on the order of 100 milliseconds. G.131, published by the International Telecommunication Union, Telecommunication Standardization Sector (ITU-T), recommends the use of echo control devices in communications systems having delays above 25 milliseconds. Consequently, network echo control measures are commonplace.
Echo control, absent proper design and use, can itself contribute to degraded communications quality. Echo cancellation represents a common echo control technique wherein a linear echo canceller (LEC) produces an estimated-echo signal based on processing an echo-causing signal. In other words, the linear echo canceller models the influence an echo-causing signal has on an echo-containing signal. By subtracting this estimated-echo signal from the echo-containing signal, the resultant echo-cancelled signal is ideally stripped of its echo component. System non-linearities, however, result in a residual echo component not cancelled by the linear estimated-echo signal. For example, pulse-based digital encoding algorithms introduce signal non-linearities. Loudspeaker audio distortion, resulting in differences between the echo-causing loudspeaker drive signal and the actual echo signal input to a microphone, represents an example of audio-based non-linearity.
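The linear echo cancellation described above can be illustrated with a minimal sketch. The text does not name an adaptation rule, so the normalized least-mean-squares (NLMS) update used here is an assumption, as are the function names, the tap count, the step size, and the short simulated echo path. The sketch models the echo path as an adaptive FIR filter driven by the echo-causing signal and subtracts the resulting estimated-echo signal from the echo-containing signal.

```python
import random


def simulate_echo(far_end, echo_path):
    """Convolve the echo-causing signal with a hypothetical linear echo path."""
    return [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far_end))
    ]


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return (echo_cancelled, estimated_echo) sample lists.

    far_end is the echo-causing signal and mic is the echo-containing signal.
    NLMS is an assumed adaptation rule, not one specified by the text.
    """
    weights = [0.0] * taps   # adaptive model of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    cancelled, estimated = [], []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # estimated echo
        e = d - y                                         # echo-cancelled sample
        norm = sum(h * h for h in history) + eps          # input power (regularized)
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        estimated.append(y)
        cancelled.append(e)
    return cancelled, estimated


def energy(samples):
    """Sum of squared samples."""
    return sum(s * s for s in samples)


# Illustrative run: a purely linear echo path and no near-end speech.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
mic = simulate_echo(far_end, [0.0, 0.6, -0.3, 0.1])
cancelled, estimated = nlms_echo_cancel(far_end, mic)
```

Because the simulated echo path in this run is purely linear, the residual energy becomes negligible once the filter has converged. A non-linear echo path, such as the loudspeaker distortion mentioned above, would leave a residual echo component that a linear model of this kind cannot remove.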
One interesting prior-art approach to desired voice detection in the context of LEC control is disclosed in a master's thesis written by S. G. Sankaran entitled, "Implementation and Evaluation of Echo Cancellation Algorithms," published in December 1996 by Virginia Polytechnic Institute, Blacksburg, Va. This thesis describes a so-called "Itakura distance measure" used to determine the presence of desired speech based on processing an echo-containing signal in conjunction with an echo-causing signal (pp. 40-45). However, this approach is problematic because the actual echo component in the echo-containing signal experiences spectral shaping due to the echo path and, thus, differs from the echo-causing signal. In addition, the echo-causing signal does not reflect the time shifting associated with the echo path. The Sankaran thesis also discloses a "Double-Talk Detection Statistic" (DTDS) algorithm used to enable/disable adaptation of a linear echo canceller (pp. 46-48). This DTDS algorithm uses the echo-cancelled signal, an estimated-echo signal, and the echo-causing signal in determining whether to enable/disable adaptation of the linear echo canceller. However, the DTDS algorithm can be problematic as it does not consider background noise in the echo-cancelled signal in its calculations. The DTDS algorithm can also provide false detection of desired voice, like other prior-art systems, based on mistakenly identifying an increase in echo-cancelled signal energy arising from abrupt changes in the echo path as desired voice.
In general, because of residual echo problems resulting from system non-linearities or abrupt deviations in echo coupling, prior-art systems often combine a LEC with a non-linear processor (NLP) to form a "hybrid" echo suppresser. These hybrid echo suppressers pass the echo-cancelled signal through the NLP to further attenuate the echo-cancelled signal, including its residual-echo component. NLP-based echo suppression uses a non-linear process to block echo voice and pass desired voice. This operation usually involves operating the NLP in a pass-through mode or in a blocking mode, depending on detected conditions. When both users (the near- and far-end parties) talk simultaneously, a condition referred to as "double talk," existing NLPs sometimes operate in the blocking mode, which severely distorts or cuts out the desired voice. Existing hybrid echo suppressers often undesirably switch into pass-through mode when the echo-causing signal (e.g., loudspeaker output) has a step change in noise, such as when the echo path or background noise changes abruptly. This undesirable tendency to switch into pass-through mode often subjects the far-end party to undesirable echo. Because a NLP thus configured operates on the entire echo-cancelled signal, its operation must be carefully controlled or desired-voice signals (non-echo voice) may be undesirably clipped or attenuated.
As explained, muting operations of the NLP interfere with desired-voice signals if the NLP fails to quickly transition from its blocking mode to its pass-through mode. Prior-art hybrid echo suppressers process the echo-cancelled signal in conjunction with the estimated-echo signal to determine the start or continuation of the desired-voice signal to effect NLP mode control. Because updates in the estimated-echo signal lag current echo-path characteristics, sudden changes in the actual echo path and fast noise bursts result in a momentarily poor estimate of current echo. Thus, these prior-art approaches are subject to incorrectly setting the NLP mode based on changes in the echo-cancelled signal arising not from desired voice, but rather from echo-path changes, noise bursts, and other transient influences that change echo components in the echo-containing signal faster than the LEC can adapt.
The present invention includes methods and apparatus for minimizing echo in a bi-directional communications system. While elimination of echo is unquestionably desirable, how echo is controlled significantly influences the quality of transmitted speech. For example, conventional NLP-based echo suppressers may operate obtrusively and undesirably clip or suppress desired speech signals.
The improved hybrid echo suppresser of the present invention includes a LEC that subtracts estimated echo from an echo-containing signal and a NLP for removing residual echo from the resultant echo-cancelled signal. A logic circuit controls operation of the NLP such that it provides required levels of echo suppression while avoiding the disruption of desired speech signals based on quickly detecting the start of desired voice. The logic circuit repeatedly compares the echo-containing signal with the estimated-echo signal, which is derived from an echo-causing signal. In this manner, the logic circuit reliably transitions the NLP from its echo-voice (blocking) mode to its desired-voice (pass-through) mode upon detecting the start of desired voice. Moreover, the logic circuit reliably maintains the NLP in its pass-through mode based on continuing detection of desired voice in the echo-containing signal. Thus, the hybrid echo suppresser of the present invention avoids passing echo even in the presence of echo-path changes and avoids disrupting the desired-voice signal during periods of double-talk.
In an exemplary embodiment, an energy-based comparison allows the logic circuit to determine whether the echo-containing signal includes a desired-voice component. If not, the NLP operates in an echo-voice mode with significant signal attenuation, thereby ensuring sufficient echo suppression. However, upon detecting the start of desired voice, the logic circuit switches the NLP from echo-voice mode to a desired-voice mode, which has significantly less signal attenuation, thereby avoiding significant clipping or disruption of the desired voice. Thereafter, continued detection of desired voice ensures that the logic circuit does not inadvertently switch back to echo-voice mode, such as during periods of double-talk. Thus, during periods of double-talk (when there is both echo voice and desired voice), the logic circuit maintains the NLP in desired-voice mode.
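The mode control described above might be sketched as follows. This is an illustration under stated assumptions rather than the claimed logic circuit: the class name, the frame-based processing, the energy-ratio threshold, the noise floor, the hangover counter used to hold desired-voice mode, and the two gain values are all hypothetical choices that the text does not specify. The sketch compares the energy of each echo-containing frame with the energy of the corresponding estimated-echo frame and selects the NLP attenuation accordingly.

```python
def frame_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


class NlpController:
    """Hypothetical energy-based NLP mode control (all parameters assumed)."""

    ECHO_VOICE = "echo-voice"        # blocking mode, heavy attenuation
    DESIRED_VOICE = "desired-voice"  # pass-through mode, little attenuation

    def __init__(self, ratio_threshold=2.0, hangover_frames=3,
                 echo_gain=0.05, desired_gain=1.0, noise_floor=1e-6):
        self.ratio_threshold = ratio_threshold
        self.hangover_frames = hangover_frames
        self.echo_gain = echo_gain
        self.desired_gain = desired_gain
        self.noise_floor = noise_floor
        self.mode = self.ECHO_VOICE
        self._hang = 0

    def update(self, echo_containing_frame, estimated_echo_frame):
        """Compare frame energies and return the resulting NLP mode."""
        e_in = frame_energy(echo_containing_frame)
        e_est = frame_energy(estimated_echo_frame)
        # Desired voice is assumed present when the echo-containing energy
        # clearly exceeds what the estimated echo alone would explain.
        if e_in > self.ratio_threshold * e_est + self.noise_floor:
            self.mode = self.DESIRED_VOICE
            self._hang = self.hangover_frames
        elif self._hang > 0:
            self._hang -= 1          # hold desired-voice mode a little longer
        else:
            self.mode = self.ECHO_VOICE
        return self.mode

    def process(self, echo_cancelled_frame, echo_containing_frame,
                estimated_echo_frame):
        """Attenuate the echo-cancelled frame according to the current mode."""
        mode = self.update(echo_containing_frame, estimated_echo_frame)
        gain = self.desired_gain if mode == self.DESIRED_VOICE else self.echo_gain
        return [gain * s for s in echo_cancelled_frame]


# Illustrative frames given as (echo-containing, estimated-echo) pairs.
echo_only = ([0.5] * 4, [0.5] * 4)
double_talk = ([1.0] * 4, [0.5] * 4)
ctl = NlpController(hangover_frames=2)
modes = [ctl.update(*f) for f in
         (echo_only, double_talk, echo_only, echo_only, echo_only)]
```

In this run the controller starts in echo-voice mode, switches to desired-voice mode on the double-talk frame, holds that mode for the two hangover frames, and then returns to echo-voice mode. The hangover counter is only one possible way to keep the NLP from switching back prematurely.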
The improved hybrid echo suppresser of the present invention provides performance advantages in comparison to prior-art hybrid echo suppressers. Reliable detection of desired voice allows the improved hybrid echo suppresser of the present invention to appropriately control its NLP in providing effective echo suppression, while avoiding distortion or clipping of desired voice.