The present invention relates to the field of echo suppression during bi-directional communications, and more particularly to a method of variable gain echo suppression where the gain varies based on a characteristic of the residual echo after initial echo cancellation.
Several approaches have been proposed that attempt to regulate echo during bi-directional communications, particularly bi-directional communication using wireless communications devices, such as mobile terminals, which may be subject to rapidly changing acoustic environments. Typically, prior methods selectively engage an echo suppressor depending on a variety of conditions, such as when an incoming signal includes echo-causing voice. For instance, when the incoming signal does not include echo-causing voice, the echo suppressor is bypassed, or the gain of the echo suppressor may be set to one. On the other hand, when the incoming signal includes echo-causing voice (or echo-causing voice and noise), further analysis is employed to determine whether there is single-talk or double-talk. Single-talk arises when the echo-causing voice is present, but not desired voice (e.g., only the remote user is talking in the context of acoustic echo suppression); in this situation, the gain of the echo suppressor is set low so as to significantly attenuate the otherwise present echo in the outgoing signal. Double-talk arises when both echo-causing voice and desired voice are present (e.g., both the local user and the remote user are talking simultaneously); in this situation, the gain of the echo suppressor is set to an intermediate level to attenuate the potential echo signal, but not eliminate the desired voice from the outgoing signal. Thus, the echo suppressor is controlled differently depending on whether echo-causing voice, desired voice, or both are present. One difficulty in such approaches is in having the communications devices quickly and accurately determine which condition applies at any given moment in time. 
Stated rather simplistically, it is very difficult for communications devices to determine who is doing the talking (whether the local user, the remote user or users, or both) at any given time and to rapidly and accurately switch between the corresponding modes, especially in the presence of rapidly changing background noise and/or a rapidly changing echo path. Typically, this mode selection task involves a so-called desired-voice detector, which is necessarily complex.
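The mode-based control used by such prior methods can be sketched as follows. This is only an illustration: the gain constants and the names `TalkState`, `classify`, and `suppressor_gain` are hypothetical, and the two boolean inputs stand in for the voice detectors, which are the hard part in practice.

```python
from enum import Enum


class TalkState(Enum):
    NO_ECHO_CAUSING_VOICE = 0  # incoming signal has no echo-causing voice
    SINGLE_TALK = 1            # echo-causing voice only
    DOUBLE_TALK = 2            # echo-causing voice and desired voice


# Hypothetical linear gains; the text only says "one", "low", and "intermediate".
BYPASS_GAIN = 1.0
SINGLE_TALK_GAIN = 0.05
DOUBLE_TALK_GAIN = 0.5


def classify(echo_causing_voice: bool, desired_voice: bool) -> TalkState:
    """Map the two detector decisions onto one of the three modes."""
    if not echo_causing_voice:
        return TalkState.NO_ECHO_CAUSING_VOICE
    return TalkState.DOUBLE_TALK if desired_voice else TalkState.SINGLE_TALK


def suppressor_gain(state: TalkState) -> float:
    """Select the echo suppressor gain for the detected mode."""
    if state is TalkState.NO_ECHO_CAUSING_VOICE:
        return BYPASS_GAIN
    if state is TalkState.SINGLE_TALK:
        return SINGLE_TALK_GAIN
    return DOUBLE_TALK_GAIN
```

Every gain decision in this sketch depends on `desired_voice` being correct, which is why an error in the desired-voice detector translates directly into audible echo or clipped speech.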
The present invention obviates the need to differentiate between single-talk (echo only) and double-talk (echo plus desired voice) situations, thereby obviating the need for a desired-voice detector. The approach of the present invention focuses on controlling the gain of the residual echo suppressor based on the estimated energy of the residual voice echo, or alternatively of the entire residual echo, preferably on a per-frame basis. In some embodiments, this residual voice echo energy is compared against the estimated non-echo energy to determine the gain required to attenuate the residual voice echo below a psychoacoustic perception level. Optionally, comfort noise is added to the output signal from the residual echo suppressor in an amount that corresponds to the signal energy lost through the residual echo suppressor. Thus, in some embodiments, desired voice and background noise (including local background noise and comfort noise) are used to mask the presence of residual echo.
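One way to picture the per-frame gain control is the minimal sketch below. It is not the claimed method: the masking rule, the `masking_ratio` parameter, the function names, and the scaling of comfort noise by a background noise estimate are all assumptions made for illustration. The assumed rule chooses the largest gain g (at most one) for which the attenuated residual echo energy, g² · E_echo, does not exceed `masking_ratio` times the non-echo energy.

```python
import math


def frame_gain(residual_echo_energy: float,
               non_echo_energy: float,
               masking_ratio: float = 0.1) -> float:
    """Linear suppressor gain for one frame (assumed masking rule).

    masking_ratio is a hypothetical stand-in for the psychoacoustic
    perception level: the largest echo-to-non-echo energy ratio
    treated as inaudible.
    """
    if residual_echo_energy <= 0.0:
        return 1.0  # no residual echo estimated, so pass the frame unchanged
    masker = max(non_echo_energy, 0.0)
    gain = math.sqrt(masking_ratio * masker / residual_echo_energy)
    return min(1.0, gain)  # the suppressor never amplifies


def comfort_noise_energy(gain: float, background_noise_energy: float) -> float:
    """Comfort noise energy to add after suppression (assumed rule).

    Replaces the share of the estimated background noise energy that
    the suppressor removed, which is (1 - gain**2) of it.
    """
    return (1.0 - gain * gain) * background_noise_energy
```

No single-talk or double-talk decision appears anywhere in this sketch. When desired voice or background noise is strong relative to the residual echo, the computed gain approaches one by itself; when the echo dominates, the gain drops and more comfort noise is added.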