Echo is typically introduced by phone terminals operating in speakerphone mode or by a hybrid that converts a 2-wire analog circuit to 4-wire transmission lines in the public switched telephone network (PSTN). In an IP network, the echo (acoustic and/or hybrid) is carried through from the terminals and is subject to variable delays and jitter. In an IP conferencing system, echo introduced by any participant is heard by all participants other than the terminal(s) introducing it, which degrades the quality of the audio conference. Monitoring and removing echo from IP audio streams is an expensive operation in terms of media processing resource utilization.
IP-based conference servers are typically referred to as IP media servers. They are employed in telephony networks and perform a variety of basic and enhanced services, including conferencing, audio and video interactive voice response (IVR), transcoding, audio and video announcements, and other advanced speech services. IP media servers may also be employed in networks that provide video conferencing services, as well as typical data exchange services of the sort that occur over the internet, over virtual private networks, within wide area networks and local area networks, and the like. Data exchange and processing performed by the media server are based on packet processing with fixed maximum processing time requirements.
IP multimedia conferencing servers allow a number of participants to join a conference. The conference service provides for the mixing of participants' media by a mixer resource, allowing all participants to hear or see other participants as they become active during the conference. The conference mixer resource may use media from all participants to determine which participants will be heard or seen during conference operation as active participants. The set of active participants can dynamically change in real time as a given participant stops contributing while another participant starts contributing.
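The active-participant selection described above can be illustrated with a minimal sketch. It is not taken from any particular media server: the energy-based selection rule, the `threshold` and `max_active` parameters, the 16-bit PCM frame format, and the choice to exclude a listener's own audio from the mix are all assumptions made for illustration.

```python
from typing import Dict, List

Frame = List[int]  # one frame of 16-bit PCM samples (assumed format)


def frame_energy(frame: Frame) -> int:
    """Sum of squared samples, used here as a simple activity measure."""
    return sum(s * s for s in frame)


def select_active(frames: Dict[str, Frame], max_active: int,
                  threshold: int) -> List[str]:
    """Pick up to max_active participants whose energy meets the threshold.

    Hypothetical rule: loudest first, ties broken by participant id.
    """
    energies = {pid: frame_energy(f) for pid, f in frames.items()}
    loud = [pid for pid, e in energies.items() if e >= threshold]
    loud.sort(key=lambda pid: (-energies[pid], pid))
    return loud[:max_active]


def mix_for(listener: str, frames: Dict[str, Frame],
            active: List[str]) -> Frame:
    """Sum the active participants' frames, skipping the listener's own."""
    if not frames:
        return []
    length = len(next(iter(frames.values())))
    out = [0] * length
    for pid in active:
        if pid == listener:
            continue  # assumed: a participant does not hear itself
        for i, s in enumerate(frames[pid]):
            out[i] += s
    # Saturate to the 16-bit range instead of wrapping around.
    return [max(-32768, min(32767, s)) for s in out]


frames = {"alice": [1000, -1000], "bob": [10, -10], "carol": [500, 500]}
active = select_active(frames, max_active=2, threshold=1000)
bob_hears = mix_for("bob", frames, active)
alice_hears = mix_for("alice", frames, active)
```

Calling `select_active` once per frame interval lets the active set change as one participant falls silent and another starts speaking, which mirrors the dynamic behavior described above.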
A single instance of a conferencing service may be distributed over N processors, where N>=1. A set of media processing servers may be collocated within the same physical server or may be distributed over a number of physical servers interconnected via IP communications interfaces, whether at nearby or distant locations.
Regardless of whether the conference mixer resources are collocated or distributed, the user experience of the services and of participant interaction in the conference preferably should not be altered. For instance, in an audio conference, all participants should hear the same conference output mix whether the conference mixer resources are geographically distributed or collocated.
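One way to see why a distributed mix can match a collocated one is that sample-wise summation does not depend on how the inputs are grouped. The sketch below is only an illustration under simplifying assumptions: frames are already time-aligned, every participant is mixed, and no per-server clipping or gain is applied. The function names are hypothetical.

```python
from typing import List

Frame = List[int]  # one frame of PCM samples (assumed format)


def sum_frames(frames: List[Frame]) -> Frame:
    """Sample-wise sum of equally sized frames."""
    if not frames:
        return []
    return [sum(samples) for samples in zip(*frames)]


def collocated_mix(all_frames: List[Frame]) -> Frame:
    """Single mixer: sum every participant's frame in one place."""
    return sum_frames(all_frames)


def distributed_mix(frames_per_server: List[List[Frame]]) -> Frame:
    """Each server sums its local participants; partial mixes are then summed."""
    partials = [sum_frames(local) for local in frames_per_server if local]
    return sum_frames(partials)


server_a = [[100, 200], [10, 20]]  # two participants on one server
server_b = [[1, 2]]                # one participant on another server

single = collocated_mix(server_a + server_b)
spread = distributed_mix([server_a, server_b])
```

Under these assumptions both paths produce the same output frame. If saturation were applied to each partial mix rather than only to the final sum, the two results could differ, which is one reason a real distributed mixer would need to coordinate such steps.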
IP multimedia peer-to-peer servers allow two participants to participate in a two-way conference.