A teleconference is the live exchange of information between several persons who are remote from one another but linked by a telecommunications system. The telecommunications system may support the teleconference by providing one or more of audio, video, and data services. The term teleconference is therefore taken to include videoconferences, web conferences, and other forms of mixed-media conference, as well as purely audio conferences.
One of the main problems in achieving a good-quality teleconference experience is the need to eliminate audio feedback, or echo, caused by a participant's own speech being played back to them by the teleconferencing service. Until recently, most echo cancellation algorithms worked on the assumption that the only possible path for audio to get from one participant's microphone to another participant's microphone was through being sent to the teleconferencing server and back again (typically with a delay of 100 to 200 milliseconds or more).
In recent times, however, with cheap network links and computer telephony, it is common for several conference participants to be physically adjacent to each other in a meeting room but to have separate lines open to the teleconferencing server. In such a situation, the person speaking may be picked up by several different microphones. Since each teleconference participant in the same room will also have a loudspeaker playing the sound of the teleconference, the number of potential feedback loops increases dramatically with each active microphone in the room, which makes good echo cancellation very difficult to achieve.
Current echo cancellation is based upon detecting when the signal received from a microphone contains duplicate copies of the main speech signal that are attenuated and offset by a delay. As there are multiple possible causes of echo, the algorithms must allow for several echoes, each with a different delay. The process of detecting and eliminating these echoes is never perfect and risks introducing significant distortion into the speech signal.
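By way of illustration only, the detection of a single attenuated, delayed copy can be sketched as a cross-correlation search. The sketch below is not any particular prior-art algorithm. The function names `estimate_echo` and `cancel_echo`, the sample-based delay search, and the single-echo model are all assumptions made for the example. Practical echo cancellers typically use adaptive filters and handle several echoes at once.

```python
def estimate_echo(reference, captured, max_delay):
    """Estimate the (delay, gain) of the strongest delayed copy of
    `reference` found in `captured`.

    Hypothetical helper: assumes one dominant echo, equal-length sample
    lists, and a delay of between 1 and `max_delay` samples.
    """
    n = len(reference)
    best_delay, best_gain = 0, 0.0
    for delay in range(1, max_delay + 1):
        overlap = n - delay
        # Correlate the reference with the captured signal shifted by `delay`.
        corr = sum(reference[i] * captured[i + delay] for i in range(overlap))
        # Normalise by the reference energy over the same overlap, which
        # turns the correlation into a least-squares gain estimate.
        energy = sum(reference[i] * reference[i] for i in range(overlap))
        gain = corr / energy if energy > 0.0 else 0.0
        if abs(gain) > abs(best_gain):
            best_delay, best_gain = delay, gain
    return best_delay, best_gain


def cancel_echo(reference, captured, delay, gain):
    """Subtract the estimated echo (a scaled, delayed reference) from
    `captured`. Any error in `delay` or `gain` remains in the output as
    residual echo or distortion."""
    cleaned = list(captured)
    for i in range(delay, len(cleaned)):
        cleaned[i] -= gain * reference[i - delay]
    return cleaned
```

Here `reference` stands for the main speech signal and `captured` for the microphone signal. Because the subtraction is only as good as the estimated delay and gain, the sketch also shows why imperfect detection leaves residual echo or distorts the wanted speech.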