With the widespread use of speakerphones and teleconferencing, acoustic echo cancellation has become increasingly important. In particular, an acoustic echo canceller (AEC) aims to reduce or eliminate undesired echoes. An undesired echo is generated when the loudspeaker signal feeds back into the microphone, usually by a direct path due to loudspeaker-microphone coupling and by an indirect path due to acoustic reflections of the loudspeaker signal on objects and walls. For example, in voice communications over telephone or the Internet, a speaker will hear a delayed and filtered version of his or her own voice if acoustic echo is not sufficiently reduced by the terminal associated with the other party to the communication.
For an effective application of an AEC in a terminal, the sampling rate of the digital-to-analog (D/A) convertor that reconstructs the analog signal to be sent to the loudspeaker and the sampling rate of the analog-to-digital (A/D) convertor that samples the speech signal picked up by the microphone should match exactly. It has been found that even a small clock skew between the sampling rates can significantly degrade the performance of an AEC. Generally, the reliability of the AEC degrades as the sampling rate offset between the loudspeaker and microphone signals increases. Clock skew in the sampling rates of a terminal is a frequent problem, for example, in PC-based software terminals. Sampling rate skew can be assumed whenever, for example, an external USB camera with a built-in microphone and A/D convertor is used for audio recording in conjunction with a separate soundboard for audio playback. In this case, the A/D and D/A convertors do not derive their clocks from a common reference (quartz clock), and therefore are not synchronized.
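To illustrate why even a small skew matters, the following minimal sketch computes how far the loudspeaker and microphone sample streams drift apart over time. The function name and the numeric values (a 48 kHz nominal rate and a 100 ppm offset) are assumptions chosen for illustration only and do not come from the methods discussed here.

```python
def drift_in_samples(nominal_rate_hz, offset_ppm, seconds):
    """Samples of misalignment accumulated between two converters.

    nominal_rate_hz: nominal sampling rate shared by both converters
    offset_ppm:      relative rate offset in parts per million (assumed value)
    seconds:         elapsed time
    """
    return nominal_rate_hz * offset_ppm * 1e-6 * seconds


if __name__ == "__main__":
    # Hypothetical example: 48 kHz nominal rate, 100 ppm offset, 10 seconds.
    print(round(drift_in_samples(48000, 100, 10), 3))
```

Under these assumed values the two streams are misaligned by roughly 48 samples after ten seconds, so the echo path seen by an adaptive filter appears to shift continuously even when the room itself does not change.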
Clock skew compensation methods exist for deployment in terminals. Such terminal-based methods typically make use of read and write pointer locations in buffers associated with the D/A and A/D convertors. For example, when the receive (RX) buffer read-pointer increments faster than the transmit (TX) buffer write-pointer, the sampling rate of the D/A convertor is greater than the sampling rate of the A/D convertor. Therefore, the difference between the read-pointer increments for the receive buffer and the write-pointer increments for the transmit buffer, measured over the same specified time interval, can be used to estimate the clock offset or sampling rate offset. The resulting offset can then be used to control a re-sampling rate of one of the signals in order to achieve the same sampling rate for the loudspeaker and microphone signals. See, for example, M. Pawig and G. Enzner, “Adaptive Sampling Rate Correction for Acoustic Echo Control in Voice-Over-IP,” IEEE Trans. on Signal Processing, Vol. 58, No. 1 (January 2010); or D. Miljkovic et al., “Clock Skew Compensation by Speech Interpolation,” IEEE Int'l Conf. on Digital Telecommunications (2006), each incorporated by reference herein.
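The pointer-based estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited references: the function names are hypothetical, the pointer increments are assumed to be counted over the same time interval, the microphone signal is the one chosen for re-sampling, and plain linear interpolation stands in for whatever interpolator a real terminal would use.

```python
def estimate_rate_ratio(rx_read_increments, tx_write_increments):
    """Estimate the D/A-to-A/D sampling rate ratio.

    Both counts are pointer increments observed over the same time interval
    (hypothetical inputs). A ratio above 1 means the D/A convertor runs faster.
    """
    if tx_write_increments <= 0:
        raise ValueError("tx_write_increments must be positive")
    return rx_read_increments / tx_write_increments


def resample_linear(signal, ratio):
    """Re-sample `signal` by `ratio` using linear interpolation.

    The output has int(len(signal) * ratio) samples. Linear interpolation is
    an assumption made for brevity.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    last = len(signal) - 1
    out = []
    for n in range(int(len(signal) * ratio)):
        pos = n / ratio                 # fractional position in the input
        i = min(int(pos), last)         # index of the left neighbour
        frac = pos - i                  # distance past the left neighbour
        right = signal[min(i + 1, last)]
        out.append((1.0 - frac) * signal[i] + frac * right)
    return out


if __name__ == "__main__":
    # Hypothetical counts: 48005 RX reads versus 48000 TX writes.
    ratio = estimate_rate_ratio(48005, 48000)
    print("offset in ppm:", round((ratio - 1.0) * 1e6, 2))
    # Re-sample a short microphone frame so its rate matches the D/A rate.
    mic_frame = [0.0, 1.0, 2.0, 3.0]
    print(len(resample_linear(mic_frame, ratio)))
```

In practice the estimated ratio would be smoothed over many intervals before it drives the re-sampler, since individual pointer readings are noisy; that smoothing is omitted here.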
When acoustic echo is insufficiently suppressed or cancelled in a terminal, acoustic echo cancellation can be attempted remotely in the network. A number of technical problems exist, however, for network-based acoustic echo cancellation. For example, the above-described terminal-based clock skew compensation technique cannot be applied in the network, since the read-pointer and write-pointer positions are not accessible from a remote location such as the network.
Thus, existing network services provide only acoustic echo suppression (AES). The perceptual performance of an acoustic echo suppressor, however, is significantly inferior to the perceptual performance of an AEC. A particular drawback of AES is the lack of transparency in a call, apparent when both ends attempt to talk simultaneously. In its rudimentary form, an AES allows only one end to talk (similar to a half-duplex communication mode) by inserting a loss in one signal path. Even though enhancements such as comfort noise insertion can improve the perceived communication quality, the performance of an acoustic echo suppressor is still significantly inferior to the performance of a true AEC.
A need therefore exists for improved techniques for compensating for clock skew arising in a terminal, to allow for an effective application of an AEC. A further need exists for clock skew compensation techniques that can be employed in a terminal or in the network.