Acoustic echo is a common problem in full-duplex audio systems, such as audio conferencing or videoconferencing systems. Acoustic echo occurs when far-end speech sent over a network comes out of the near-end loudspeaker, feeds back into a nearby microphone, and then travels back to the originating site. Talkers at the far-end location then hear their own voices returning shortly after they have spoken, which is undesirable.
To reduce this type of echo, audio systems can use an acoustic echo cancellation technique to remove the loudspeaker audio that has coupled back into the microphone before that audio reaches the far-end. Acoustic echo cancellers employ adaptive filtering techniques to model the impulse response of the conference room in order to reproduce the echoes from the loudspeaker signal. The estimated echoes are then subtracted from the outgoing microphone signals to prevent these echoes from going back to the far-end.
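The adaptive-filtering idea described above can be illustrated with a minimal sketch. The normalized least-mean-squares (NLMS) update used here is only one common choice of adaptive filter and is an assumption of this example, as are the function names, the tap count, the step size, and the four-tap room response; none of them come from the disclosure. The sketch also assumes that the loudspeaker and microphone share one sample clock.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the microphone signal.

    far_end: samples sent to the loudspeaker
    mic:     samples captured by the microphone (echo only, in this sketch)
    Returns the residual signal and the learned filter weights.
    """
    w = [0.0] * taps          # running estimate of the room impulse response
    history = [0.0] * taps    # recent loudspeaker samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - echo_estimate                       # signal sent to the far-end
        norm = sum(h * h for h in history) + eps    # input power (NLMS normalization)
        w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        residual.append(e)
    return residual, w

def simulate_echo(n=4000, seed=0):
    """Build a hypothetical far-end signal and its echo through a made-up room."""
    rng = random.Random(seed)
    room = [0.6, 0.3, -0.2, 0.1]   # assumed echo path, for illustration only
    far = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mic = [sum(room[k] * far[i - k] for k in range(len(room)) if i - k >= 0)
           for i in range(n)]
    return far, mic, room

far, mic, room = simulate_echo()
residual, weights = nlms_echo_cancel(far, mic)
```

In this idealized setting the learned weights approach the assumed room response and the residual decays toward zero. The sketch has no near-end speech, no noise, and no clock mismatch, so it says nothing about performance under the conditions discussed below.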
In some situations, the microphone and loudspeakers use converters with different clocks. For example, the microphone captures an analog waveform, and an analog-to-digital (A/D) converter converts the analog waveform into a digital signal. Likewise, the loudspeaker receives a digital signal, and a digital-to-analog (D/A) converter converts the digital signal to an analog waveform.
The conversions performed by the converters use a constant sampling rate provided by a crystal that generates a stable, fixed-frequency clock signal. When the converters are driven by a single clock, they produce and consume the same number of samples as one another. However, the converters may be driven by separate clocks that differ in performance, frequency, stability, accuracy, etc. Thus, the two converters may perform their conversions at slightly different rates. Accordingly, the number of samples produced over time by the A/D converter will not match the number of samples consumed in the same period of time by the D/A converter. This difference becomes more pronounced over time.
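The growth of this mismatch can be made concrete with a short arithmetic sketch. The function name and the two clock rates are hypothetical values chosen for illustration: both converters are nominally 48 kHz, but the A/D clock is assumed to run 2.4 Hz fast.

```python
def sample_count_gap(adc_rate_hz, dac_rate_hz, seconds):
    """Samples produced by the A/D converter minus samples consumed
    by the D/A converter over the same wall-clock interval."""
    return (adc_rate_hz - dac_rate_hz) * seconds

# Hypothetical rates: nominally equal, but the A/D crystal runs slightly fast.
ADC_RATE_HZ = 48002.4
DAC_RATE_HZ = 48000.0

# Gap after one second, one minute, and one hour.
gaps = [sample_count_gap(ADC_RATE_HZ, DAC_RATE_HZ, t) for t in (1, 60, 3600)]
```

Under these assumed rates the gap is roughly 2.4 samples after one second, 144 after one minute, and 8640 after one hour, so the mismatch grows linearly with time.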
For good acoustic echo cancellation, the loudspeaker's clock and the microphone's clock are preferably at the same frequency to within a few parts per million (PPM). In a desktop computer or wireless application, the loudspeaker and microphone clocks are typically controlled by physically separate crystals, so their frequencies may differ by 100 PPM or more. Dealing with this discrepancy presents a number of difficulties when performing acoustic echo cancellation. Moreover, attempting to measure the frequency difference by using the audio present on the loudspeaker and microphone channels can be difficult as well.
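To give a sense of scale for these PPM figures, the following sketch converts a clock offset into a drift rate and into the time needed for the two sample streams to slip by one full sample. The 48 kHz sample rate and the function names are assumptions of this example, not values taken from the disclosure.

```python
def drift_samples_per_second(sample_rate_hz, ppm_offset):
    """Relative drift between two sample streams, in samples per second."""
    return sample_rate_hz * abs(ppm_offset) * 1e-6

def seconds_until_one_sample_slip(sample_rate_hz, ppm_offset):
    """Time for the streams to drift apart by one whole sample."""
    drift = drift_samples_per_second(sample_rate_hz, ppm_offset)
    return float("inf") if drift == 0 else 1.0 / drift

SAMPLE_RATE_HZ = 48000  # assumed nominal rate, for illustration only

slip_at_100_ppm = seconds_until_one_sample_slip(SAMPLE_RATE_HZ, 100)
slip_at_2_ppm = seconds_until_one_sample_slip(SAMPLE_RATE_HZ, 2)
```

At the assumed 48 kHz rate, a 100 PPM offset corresponds to about 4.8 samples of drift per second, or a one-sample slip roughly every 0.2 seconds, whereas a 2 PPM offset takes on the order of ten seconds to slip by one sample.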
One prior art technique, disclosed in U.S. Pat. No. 7,120,259 to Ballantyne et al., performs adaptive estimation and compensation of clock drift in acoustic echo cancellers. This prior art technique examines buffer lengths of the loudspeaker and microphone paths and tries to maintain equivalent buffer lengths to within one sample. However, accuracy finer than one sample may be necessary for good acoustic echo cancellation.
The subject matter of the present disclosure is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.