Video conferencing services, such as iChat, Google+, and Skype, have become increasingly popular on the Internet. These services require high-bandwidth, low-delay video transmission between distributed users. More recently, with the proliferation of consumer devices equipped with video cameras and the penetration of high-speed access networks, growing communication among multiple users has driven the popularity of multi-party functionality in video conferencing, which involves multiple participants in the same video session to facilitate real-time group interactions. Because the workload in a multi-party conferencing system increases quadratically with the number of users, delivering high-quality video so as to maximize system-wide user quality-of-experience (QoE) has become a very challenging problem.
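The quadratic growth in workload can be illustrated with a short sketch. It assumes, as one plausible reading of the statement above, that every participant delivers its own video to every other participant, so the system carries one directed stream per ordered pair of users. The function name `directed_streams` is hypothetical and used only for illustration.

```python
def directed_streams(num_users: int) -> int:
    """Count sender-to-receiver video streams in a multi-party session.

    Assumption (not stated in the text): each user's video must reach
    every other user, giving n * (n - 1) directed streams in total.
    """
    if num_users < 0:
        raise ValueError("num_users must be non-negative")
    return num_users * (num_users - 1)


if __name__ == "__main__":
    # Show how the stream count grows as the session gets larger.
    for n in (2, 4, 8, 16):
        print(n, directed_streams(n))
```

Under this assumption, doubling the number of participants roughly quadruples the number of streams that the system must carry.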
A solution currently employed in commercial multi-party conferencing systems is a multipoint control unit (MCU)-based approach. In this approach, each participant first sends its video to the MCU, which may mix the received videos, and the MCU then sends the video to the receivers along separate connections, as shown in FIG. 1. This approach is simple and makes user states easy to maintain. However, because traffic detours through the MCU and the MCU bears the processing burden, this approach may cause large delays, and the MCU may become a single point of failure and a bottleneck for the entire communication session. For instance, if the MCU is poorly located relative to the participants, the delay between end users cannot be guaranteed. Since user QoE degrades significantly when the one-way end-to-end video delay exceeds 300 milliseconds, the results of this solution remain far from satisfactory for real-time communication.
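The delay penalty of the MCU detour can be sketched as a simple sum. The model below is an illustrative assumption rather than a measured result: it treats the one-way delay through the MCU as the sender-to-MCU delay, plus an MCU processing time, plus the MCU-to-receiver delay, and compares the total with the 300 millisecond threshold mentioned above. All function names and the example figures are hypothetical.

```python
QOE_DELAY_LIMIT_MS = 300.0  # one-way delay threshold cited in the text


def mcu_one_way_delay_ms(sender_to_mcu_ms: float,
                         mcu_to_receiver_ms: float,
                         mcu_processing_ms: float = 0.0) -> float:
    """Estimate one-way delay when video is relayed through an MCU.

    Assumed additive model: uplink leg + MCU processing + downlink leg.
    """
    return sender_to_mcu_ms + mcu_processing_ms + mcu_to_receiver_ms


def exceeds_qoe_limit(delay_ms: float) -> bool:
    """Return True if the delay is above the 300 ms QoE threshold."""
    return delay_ms > QOE_DELAY_LIMIT_MS


if __name__ == "__main__":
    # Hypothetical figures: a poorly placed MCU adds two long network legs.
    relayed = mcu_one_way_delay_ms(120.0, 150.0, mcu_processing_ms=40.0)
    print(relayed, exceeds_qoe_limit(relayed))
```

With these made-up figures, two individually tolerable network legs plus the processing time at the MCU push the relayed path over the threshold, which is the placement problem described above.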
There is a need for multi-party video conferencing to deliver high-quality video telephony that can support a high data rate while guaranteeing a small bounded end-to-end delay.