Audio conferencing systems typically have a speaker for playing audio generated by a far end audio signal source, and one or more microphones for capturing audio information generated by a near end or local audio signal source. As a consequence of the proximity of the speaker to the one or more microphones in the system, at least some of the acoustic signal energy in the far end audio played by the speaker can be picked up by the microphones and sent back to the far end, where it can be played and heard as acoustic echo. This acoustic echo can be very disruptive during the course of a conversation, as participants at the far end may have to wait for the echo to subside before speaking again.
To mitigate the disruptive effects of acoustic echo in an audio conferencing system, acoustic echo cancellation (AEC) arrangements exist that remove a large portion of the acoustic echo component from the local microphone signal before it is sent to the far end. FIG. 1 shows a typical prior art audio conferencing system 10 having a loudspeaker 12 and a microphone 13, both of which are hard wired to the system. The system 10 also includes a digital signal processor (DSP) 14 that includes, among other things, an adaptive filter 16 and a summation function 15. Generally, the audio conferencing system 10 operates as follows to cancel acoustic echo. An acoustic signal 11 generated by a far end (F.E.) audio source is received by the local (near end) audio conferencing system 10, which sends it to the loudspeaker 12 for reproduction. The far end acoustic signal 11 is also sent to the DSP 14, where the adaptive filter 16 is programmed to calculate an estimate of the room echo (reference signal) 17. Typically, some energy from the acoustic signal 11 reproduced by the loudspeaker 12 is picked up by the microphone 13 (along with any local acoustic signal) and is sent to the summation function 15 operating in the DSP 14, which subtracts the calculated estimate of the room echo (reference signal) 17 from a microphone signal 18 to produce an echo cancelled signal 19 that is sent to the far end. This echo cancelled signal 19 is also used by the DSP 14 to train the adaptive filter 16.
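The signal flow described above can be illustrated with a minimal sketch. The text does not say which adaptation algorithm the adaptive filter 16 uses; the sketch assumes a normalized least-mean-squares (NLMS) update, which is one common choice. The function names, the tap count, the step size, and the room impulse response are all hypothetical values chosen for illustration, and the simulation assumes no near end talker is active.

```python
import random


def simulate_room_echo(far_end, room_response):
    """Convolve the far end signal with a hypothetical room impulse response."""
    echo = []
    for n in range(len(far_end)):
        acc = 0.0
        for k, h in enumerate(room_response):
            if n - k >= 0:
                acc += h * far_end[n - k]
        echo.append(acc)
    return echo


def cancel_echo(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Return the echo cancelled signal (signal 19 in the text).

    Assumption: NLMS adaptation; the source does not name an algorithm.
    """
    weights = [0.0] * num_taps   # adaptive filter 16 coefficients
    recent = [0.0] * num_taps    # latest far end samples, newest first
    cancelled = []
    for x, d in zip(far_end, mic):
        recent = [x] + recent[:-1]
        # Estimate of room echo (reference signal 17).
        estimate = sum(w * r for w, r in zip(weights, recent))
        # Summation function 15: microphone signal 18 minus the estimate.
        error = d - estimate
        # Train the filter on the echo cancelled signal.
        power = sum(r * r for r in recent) + eps
        weights = [w + step * error * r / power
                   for w, r in zip(weights, recent)]
        cancelled.append(error)
    return cancelled


if __name__ == "__main__":
    rng = random.Random(0)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    mic = simulate_room_echo(far_end, [0.0, 0.6, 0.3, -0.1])
    out = cancel_echo(far_end, mic)
    print("echo energy (last 200 samples):", sum(m * m for m in mic[-200:]))
    print("residual energy (last 200 samples):", sum(e * e for e in out[-200:]))
```

In this sketch the microphone signal contains only echo, so the residual energy after the filter has adapted indicates how much echo remains. A practical system would also need to handle near end speech occurring at the same time as far end speech, which this sketch does not attempt.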
For the summation function 15 to cancel the acoustic echo, the reference signal and the microphone signal that includes the echo to be cancelled must be processed by the summation function at substantially the same time (that is, aligned); otherwise, some or all of the local room echo will not be cancelled. A round-trip delay elapses between the time the F.E. signal 11 is reproduced by the loudspeaker 12 and the time the microphone signal 18, which contains the resulting acoustic echo along with the local acoustic signal, is received by the DSP 14. This delay can be determined using empirical methods and programmed into the system 10, where it is used to determine when in time the reference signal 17 is subtracted from the microphone signal 18.
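The effect of alignment can be shown with a small sketch. The text says only that the delay is determined empirically; the cross-correlation search below is one assumed way of doing so, not a method taken from the source. The function names and the sample values are hypothetical.

```python
def delay_signal(signal, num_samples):
    """Delay a signal by num_samples, zero-padding the start, same length."""
    return ([0.0] * num_samples + list(signal))[:len(signal)]


def estimate_delay(reference, mic, max_lag):
    """Assumed empirical method: pick the lag with the largest correlation."""
    def correlation(lag):
        return sum(mic[i + lag] * reference[i]
                   for i in range(len(reference) - lag))
    return max(range(max_lag + 1), key=correlation)


def subtract_aligned(mic, reference, delay_samples):
    """Subtract the reference from the mic signal after applying the delay."""
    aligned = delay_signal(reference, delay_samples)
    return [m - r for m, r in zip(mic, aligned)]


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
    # Hypothetical round-trip delay of 3 samples and a unit-gain echo path.
    mic = delay_signal(reference, 3)
    lag = estimate_delay(reference, mic, max_lag=5)
    print("estimated delay:", lag)
    print("aligned residual:", subtract_aligned(mic, reference, lag))
    print("misaligned residual:", subtract_aligned(mic, reference, 0))
```

When the programmed delay matches the actual round-trip delay, the subtraction removes the echo entirely in this idealized case; with the wrong delay, the subtraction leaves echo in the output and also adds a shifted copy of the reference.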
Because all of the signal processing typically takes place in a single DSP (DSP 14 in this case), the timing of the subtraction of the reference signal from the microphone signal, which cancels the acoustic echo, can be easily controlled. Specifically, since the F.E. acoustic signal 11 and the local microphone signal 18 are both sampled in a single device, DSP 14, the timing relationships between these signals are known.