Traditional multi-party conferencing systems generally employ one of two common multi-party communications techniques: (1) IP-layer/application-layer multicast, or (2) centralized audio mixing using H.323 multi-point control units (MCUs). In the first of these techniques, i.e., the multicast approach, the system distributes multiple audio streams concurrently from all active speakers to all participants. Although multicast is well suited for broadcast applications that usually involve one active speaker, it becomes inefficient for interactive and spontaneous applications (e.g., on-line gaming) that often include many simultaneous speakers. A multicast system can become overloaded by processing many audio streams concurrently. Moreover, separate multicast trees must be maintained for all participants at all times since it is not possible to predict which participants will become speakers as time progresses.
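As a rough illustration of the multicast load described above, the following minimal sketch counts the audio streams a single participant must receive and process when every active speaker's stream is distributed unmixed. The function name and the example figures are hypothetical and are not taken from any particular conferencing system.

```python
def multicast_streams_received(active_speakers: int, is_speaker: bool = False) -> int:
    """Count the unmixed streams one participant receives under multicast.

    Assumption for this sketch: a participant does not receive its own stream.
    """
    if active_speakers < 0:
        raise ValueError("active_speakers must be non-negative")
    if is_speaker:
        return max(active_speakers - 1, 0)
    return active_speakers


# Hypothetical on-line gaming session with 8 simultaneous speakers.
listener_load = multicast_streams_received(8)                  # pure listener
speaker_load = multicast_streams_received(8, is_speaker=True)  # one of the speakers
```

Under these assumptions the per-participant load grows linearly with the number of simultaneous speakers, which is the overload condition noted above.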
The second technique, i.e., the audio mixing scheme, can effectively reduce the number of concurrent streams because it first mixes the audio streams of all active speakers into a single stream and then distributes the mixed stream to all participants. However, centralized, server-based audio mixing (e.g., the processing performed by the MCUs) cannot achieve the desired scalability and cost-effectiveness in peer-to-peer environments, where the multi-party VoIP service is most applicable. Current distributed audio mixing systems use a Coupled Distributed Processing (CDP) approach, which uses the same tree for both stream mixing and stream distribution. However, multi-party VoIP services usually present asymmetric properties: (1) the number of active speakers (i.e., stream sources) is different from the number of listeners (i.e., stream receivers), and (2) the in-bound bandwidth of a processing node is often different from its out-bound bandwidth. Such asymmetrical bandwidth is found in many Internet connections, for example, in broadband cable networks and Digital Subscriber Lines (DSL). As an example, a system may determine an optimal mixing tree for a given network topology to communicate audio source signals from all of the “speaking” nodes to a central mixing node. This optimal “mixing” tree, however, may not correspond to the best distribution tree for communicating the composite audio signal from that mixing node to all of the participating nodes that are to receive it. Because of this asymmetry, the CDP approach, which uses the same tree for both mixing and distribution, is sub-optimal.
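The effect of asymmetrical bandwidth on tree shape can be sketched as follows. This is a simplified model under stated assumptions, not a description of any existing CDP system: every stream is assumed to have the same fixed bitrate, a node's in-bound bandwidth is assumed to limit how many child streams it can receive for mixing, and its out-bound bandwidth is assumed to limit how many children it can forward the mixed stream to. The class name, function names, and bandwidth figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A processing node with asymmetrical link bandwidth (hypothetical model)."""
    in_kbps: int   # in-bound (download) bandwidth
    out_kbps: int  # out-bound (upload) bandwidth


def fan_in(node: Node, stream_kbps: int) -> int:
    """Maximum child streams the node can receive for mixing."""
    return node.in_kbps // stream_kbps


def fan_out(node: Node, stream_kbps: int) -> int:
    """Maximum children the node can send the mixed stream to."""
    return node.out_kbps // stream_kbps


def coupled_degree(node: Node, stream_kbps: int) -> int:
    """Degree usable when one tree must serve both mixing and distribution."""
    return min(fan_in(node, stream_kbps), fan_out(node, stream_kbps))


# Hypothetical DSL-like node: 8000 kbps in-bound, 1000 kbps out-bound, 64 kbps audio.
dsl_node = Node(in_kbps=8000, out_kbps=1000)
mix_degree = fan_in(dsl_node, 64)            # children in a mixing tree
dist_degree = fan_out(dsl_node, 64)          # children in a distribution tree
shared_degree = coupled_degree(dsl_node, 64) # children in a single coupled tree
```

In this model a single shared tree is constrained by the smaller of the two limits at every node, so it cannot be shaped to suit both the mixing direction and the distribution direction at once.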
Therefore, a need exists to overcome the problems with the prior art as discussed above.