The invention relates generally to reducing unwanted audio or acoustic feedback in a communication system, and particularly to an adaptive acoustic echo cancellation device for suppressing acoustic feedback between the loudspeaker and microphone of a telephone unit in a teleconferencing system. The telephone unit of a typical audio conferencing system includes a loudspeaker for broadcasting an incoming telephone signal into an entire room. Similarly, the telephone's microphone is typically designed to pick up the voice of any person within the room and transmit the voice to a remote telephone at the far end of the communication system.
Unlike conventional hand-held telephone sets, conference telephone units are prone to acoustic feedback between the loudspeaker unit and microphone. For example, a voice signal which is broadcast into the room by the loudspeaker unit may be picked up by the microphone and transmitted back over the telephone lines. As a result, persons at the far end of the communication system hear an echo of their voice. The echo lags the person's voice by the round-trip delay time for the voice signal. Typically, the echo is more noticeable as the lag between the person's voice and the echo increases. Accordingly, the echo is particularly annoying in video conferencing systems which transmit both video and audio information over the same telephone lines. The additional time required to transmit video data increases the round-trip delay of the audio signal, thereby extending the lag between a person's voice and the echo.
Many conference telephones avoid echo by allowing only half duplex communication (that is, by allowing communication over the phone line to occur in only one direction at a time) thereby preventing feedback. For example, when the loudspeaker unit is broadcasting a voice, the telephone disables the microphone to prevent the loudspeaker signal from being fed back by the microphone.
While a half duplex system avoids echo, it often cuts off a person's voice in mid-sentence. For example, when both parties speak simultaneously, the telephone unit allows communication in only one direction, thereby clipping the voice of one party.
Some loudspeaker telephones employ echo cancellation in an attempt to allow full-duplex communication without echo. Conventional echo cancellation devices attempt to remove from the microphone signal the component believed to represent the acoustic feedback. More specifically, these devices prepare an electric signal which simulates the acoustic feedback between the loudspeaker and the microphone. This electric signal is subtracted from the microphone signal in an attempt to remove the echo.
Electrically simulating the acoustic feedback is difficult since the acoustic feedback is determined by the acoustic characteristics of the room containing the microphone and speaker. This is complicated by variations in the acoustic characteristics of different rooms and by the dramatic changes in a given room's characteristics which occur if the microphone or loudspeaker is moved, or if objects are moved in the room.
To compensate for the changing characteristics of the room, many echo cancellation devices model the room's characteristics with an adaptive filter which adjusts with changes in the room. More specifically, the electric signal used to drive the telephone's loudspeaker is applied to a stochastic gradient least-mean-squares adaptive filter whose tap weights are set to estimate the room's acoustic response. The output of the filter, believed to estimate the acoustic echo, is then subtracted from the microphone signal to eliminate the component of the microphone signal derived from acoustic feedback. The resultant "echo corrected" signal is then sent to listeners at the far end of the communication system.
To assure that the adaptive filter accurately estimates the room's response, the device monitors the echo corrected signal. During moments when no one is speaking into the microphone, the adaptive filter adjusts its tap weights such that the energy of the echo corrected signal is at a minimum. In theory, the energy of the echo corrected signal is minimized when the adaptive filter removes from the microphone signal an accurate replica of the acoustic feedback. However, the adaptive process must be disabled whenever a person speaks into the microphone. Otherwise, the unit will attempt to adjust the tap weights in an effort to eliminate the speech.
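The adaptive filtering scheme described above can be illustrated with a minimal sketch of a stochastic gradient least-mean-squares canceller. The function names, the tap count, the step size, the three-tap room response, and the noise-like loudspeaker signal are all assumptions chosen for illustration and are not taken from any particular prior art device. The optional `adapt_flags` argument is likewise hypothetical; it stands in for whatever mechanism freezes the tap weights while a person speaks into the microphone.

```python
import random


def simulate_echo(far_end, room_response):
    """Convolve the loudspeaker signal with a hypothetical room impulse response."""
    echo = []
    for n in range(len(far_end)):
        acc = 0.0
        for k, h in enumerate(room_response):
            if n - k >= 0:
                acc += h * far_end[n - k]
        echo.append(acc)
    return echo


def lms_echo_canceller(far_end, mic, num_taps=8, step=0.05, adapt_flags=None):
    """Return (echo_corrected_signal, final_tap_weights).

    adapt_flags (assumed interface): one boolean per sample; the taps are
    updated only where the flag is True. None means "always adapt".
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent loudspeaker samples, newest first
    corrected = []
    for n, (x, d) in enumerate(zip(far_end, mic)):
        history = [x] + history[:-1]
        # Filter output: the current estimate of the acoustic echo.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Echo corrected signal: microphone minus estimated echo.
        e = d - echo_estimate
        corrected.append(e)
        if adapt_flags is None or adapt_flags[n]:
            # Stochastic gradient step that reduces the energy of e.
            weights = [w + step * e * h for w, h in zip(weights, history)]
    return corrected, weights


# Toy scenario: only the loudspeaker is active, so the microphone
# signal consists purely of acoustic feedback.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
room = [0.5, -0.3, 0.1]  # assumed room response, illustration only
mic = simulate_echo(far_end, room)

corrected, weights = lms_echo_canceller(far_end, mic)
early_energy = sum(e * e for e in corrected[:200])
late_energy = sum(e * e for e in corrected[-200:])
print(early_energy, late_energy)
print([round(w, 3) for w in weights[:3]])
```

In this noise-free toy case the leading tap weights approach the assumed room response and the energy of the echo corrected signal falls toward zero. If near-end speech were added to `mic` while adaptation remained enabled, the same update rule would pull the weights away from the room response, which is why the adaptation must be gated.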
Accordingly, echo cancellation devices which employ adaptive filters for estimating a room's response typically include a "double talk" detection device which monitors the microphone signal to determine when a person is speaking into the microphone. One such detector, described in D. L. Duttweiler, "A Twelve Channel Digital Echo Canceller", IEEE Trans. On Comm., Vol. COM-26, No. 5, May 1978, declares double talk when a sample of the microphone signal is greater than or equal to one-half the largest sample of the loudspeaker signal within the last N samples, where N is a constant equal to the maximum delay between the loudspeaker and the microphone. If someone is speaking into the microphone, the energy of the microphone signal is typically at least half that of the loudspeaker signal. Accordingly, the above described double talk detector properly concludes that someone is speaking into the microphone and disables the adaptive filter from adjusting its taps.
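The comparison performed by such a detector can be sketched as follows. This is a minimal illustration of the rule as stated above, not a reproduction of the cited design: the function name, the use of sample magnitudes, the inclusion of the current loudspeaker sample in the window, and the example signal values are all assumptions.

```python
def detect_double_talk(far_end, mic, window, threshold=0.5):
    """Flag each microphone sample that meets the double talk rule.

    A sample is flagged when its magnitude is greater than or equal to
    `threshold` times the largest loudspeaker magnitude among the last
    `window` samples (N in the text). Magnitudes are an assumption here.
    Note that if both signals are exactly zero the literal comparison
    0 >= 0 also flags the sample.
    """
    flags = []
    for n, d in enumerate(mic):
        start = max(0, n - window + 1)
        recent_peak = max(abs(x) for x in far_end[start:n + 1])
        flags.append(abs(d) >= threshold * recent_peak)
    return flags


# Hypothetical sample values, chosen only to exercise the rule.
far_end = [0.8, -0.6, 0.4, 0.2, 0.1, 0.1]
mic = [0.1, 0.2, -0.1, 0.5, 0.05, 0.3]
flags = detect_double_talk(far_end, mic, window=3)
print(flags)  # [False, False, False, True, False, True]
```

The resulting flags could be inverted and used to gate the tap weight updates of the adaptive filter, so that adaptation proceeds only on samples where no double talk is declared.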
If the loudspeaker and microphone are far apart from each other, the microphone signal includes little or no acoustic feedback from the loudspeaker. Further, when someone is speaking softly into the microphone, the energy of the soft voice component of the microphone signal is not alone greater than half the energy of the loudspeaker signal. Accordingly, the above described double talk detector falsely concludes that no one is speaking into the microphone and therefore enables the adaptive filter to adjust its taps. The filter accordingly begins adjusting the taps in an effort to reduce the echo corrected microphone signal to zero. Thus, by falsely concluding that no one is speaking into the microphone, the device begins to cut off the voice of the person speaking into the microphone.
If the loudspeaker is placed close to the microphone, the energy of the microphone signal may exceed half the energy of the loudspeaker signal regardless of whether someone is speaking into the microphone. For example, if the room includes ambient background noise such as that generated by a fan, the microphone picks up this sound and adds it to the substantial acoustic feedback caused by the close proximity of the microphone and loudspeaker. Accordingly, the energy of the microphone signal may exceed half the energy of the loudspeaker signal even when the loudspeaker is the only source of speech in the room. In this case, the above described double talk detector falsely concludes that someone is always speaking into the microphone and therefore permanently disables the adaptive filter from adjusting its taps.
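The two failure modes just described can be reproduced with a small numerical sketch of the fixed one-half comparison. Everything below is a contrived illustration under stated assumptions: the square-wave signals, the echo gains of 0.02 and 0.6, the talker amplitude of 0.3, the fan noise amplitude of 0.05, and the window length are hypothetical values, and the echo is modeled without delay for simplicity.

```python
def half_peak_flags(far_end, mic, window, threshold=0.5):
    """Per-sample double talk decision using the fixed one-half rule."""
    flags = []
    for n, d in enumerate(mic):
        recent = far_end[max(0, n - window + 1):n + 1]
        flags.append(abs(d) >= threshold * max(abs(x) for x in recent))
    return flags


NUM = 64
# Toy stand-in for far-end speech: a full-scale square wave.
far_end = [1.0 if (n // 4) % 2 == 0 else -1.0 for n in range(NUM)]
# Toy stand-in for a soft near-end talker (assumed amplitude 0.3).
soft_talker = [0.3 if (n // 3) % 2 == 0 else -0.3 for n in range(NUM)]
# Toy stand-in for fan noise (assumed amplitude 0.05).
fan_noise = [0.05 if n % 2 == 0 else -0.05 for n in range(NUM)]

# Case 1: loudspeaker far from microphone (assumed echo gain 0.02),
# with a soft talker present at the near end.
mic_far = [0.02 * x + s for x, s in zip(far_end, soft_talker)]
missed = half_peak_flags(far_end, mic_far, window=8)

# Case 2: loudspeaker close to microphone (assumed echo gain 0.6),
# with fan noise but nobody speaking at the near end.
mic_close = [0.6 * x + v for x, v in zip(far_end, fan_noise)]
false_alarm = half_peak_flags(far_end, mic_close, window=8)

print(any(missed))       # talker present, yet never detected
print(all(false_alarm))  # no talker, yet double talk always declared
```

In the first case the microphone magnitude never reaches half the loudspeaker peak, so the talker goes undetected and adaptation would continue against the talker's voice. In the second case the feedback alone keeps every microphone sample above the threshold, so adaptation would never be enabled.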
Therefore, one object of the present invention is to provide an acoustic echo cancellation device which includes an improved double talk detector for determining when someone is speaking into the microphone.