A full mesh peer-to-peer topology in a video conference is achieved by setting up independent real-time audio/video RTP (Real-time Transport Protocol) streams between every pair of participants, such that each participant transmits one audio/video stream to every other participant and receives one from every other participant. The main advantages of a full mesh conference over the more traditional centralized bridge conference are lower media latency and the elimination of the bottleneck that a centralized media server represents. Mesh conferencing is also more cost-efficient in terms of cloud resources. On the other hand, a full mesh peer-to-peer topology cannot scale beyond a certain number of participants per session due to bandwidth limitations, because the number of streams each participant must send and receive grows with the number of other participants. In such cases, a bridge topology, in which media is sent to a centralized media server, is more efficient and scalable.
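The scaling limit of the mesh topology follows from simple stream counting. As a rough sketch, assuming every stream uses the same hypothetical bitrate (the 1500 kbit/s figure and all function names below are illustrative assumptions, not values from this description), the per-participant load in the two topologies can be compared as follows:

```python
# Illustrative stream counting for mesh vs. bridge topologies.
# ASSUMED_STREAM_KBPS is a made-up figure used only for this example.
ASSUMED_STREAM_KBPS = 1500


def mesh_streams_per_participant(n: int) -> int:
    """Streams each participant sends (and, equally, receives) in a full mesh."""
    return max(n - 1, 0)


def mesh_total_streams(n: int) -> int:
    """Total one-way streams in a full mesh of n participants: n * (n - 1)."""
    return n * max(n - 1, 0)


def bridge_uplink_streams_per_participant(n: int) -> int:
    """With a central bridge, each participant sends a single stream to the bridge."""
    return 1 if n >= 1 else 0


def uplink_kbps(streams: int, per_stream_kbps: int = ASSUMED_STREAM_KBPS) -> int:
    """Uplink bandwidth needed to send the given number of streams."""
    return streams * per_stream_kbps


if __name__ == "__main__":
    for n in (2, 4, 8):
        mesh = mesh_streams_per_participant(n)
        bridge = bridge_uplink_streams_per_participant(n)
        print(
            f"{n} participants: mesh uplink {uplink_kbps(mesh)} kbit/s "
            f"({mesh} streams), bridge uplink {uplink_kbps(bridge)} kbit/s"
        )
```

Under these assumptions, each participant in an 8-party mesh must send 7 streams (10500 kbit/s of uplink), whereas with a bridge each participant sends a single stream regardless of conference size.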
In a multi-party conference call in meshed mode, all participants send their media directly to each other. If this meshed call is escalated to bridged mode (because legacy endpoints join the conference or because the maximum number of participants for meshed mode is exceeded), all participants in the existing conference call are forced to join the call on an audio/video bridge such as a multipoint control unit (MCU). This transition from a meshed call to a bridged call disrupts the conference already in progress. The escalation sometimes takes longer than expected, producing a blackout period in the conference call. Thus, users experience discontinuity in the video and audio streams of their conference call when escalation occurs.
Because network and device capabilities are changing rapidly, the criteria for deciding when to escalate a mesh call to a bridge call can be very dynamic. Currently, the main criterion is the number of participants. With increasing network bandwidth and the growing use of mobile endpoints and standards like WebRTC (“Web Real-Time Communication”), other factors can also affect the user experience in a mesh-based conference call. Users can have a very poor experience if their device cannot support the number of streams needed for the conference or if network conditions are poor.
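To make these factors concrete, the following is a minimal sketch of how an escalation decision might combine the participant count with per-endpoint device and network limits. Every name, field, and threshold here (`Endpoint`, `max_streams`, `MAX_MESH_PARTICIPANTS`, `ASSUMED_STREAM_KBPS`, and so on) is hypothetical and chosen only for illustration; none is taken from a particular system.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, chosen only for this example.
MAX_MESH_PARTICIPANTS = 4
ASSUMED_STREAM_KBPS = 1500


@dataclass
class Endpoint:
    name: str
    supports_mesh: bool   # False for a legacy endpoint that cannot do mesh
    max_streams: int      # streams the device can handle in each direction
    uplink_kbps: int      # estimated available uplink bandwidth
    downlink_kbps: int    # estimated available downlink bandwidth


def should_escalate_to_bridge(endpoints: List[Endpoint]) -> bool:
    """Return True if a mesh call among these endpoints should move to a bridge."""
    n = len(endpoints)
    # Criterion 1: participant count (the main criterion in use today).
    if n > MAX_MESH_PARTICIPANTS:
        return True

    peers = n - 1                          # streams per direction in a full mesh
    needed_kbps = peers * ASSUMED_STREAM_KBPS

    for ep in endpoints:
        # Criterion 2: a legacy endpoint forces bridged mode.
        if not ep.supports_mesh:
            return True
        # Criterion 3: the device cannot handle one stream per peer.
        if ep.max_streams < peers:
            return True
        # Criterion 4: the network cannot carry one stream per peer.
        if ep.uplink_kbps < needed_kbps or ep.downlink_kbps < needed_kbps:
            return True
    return False
```

In this sketch a single constrained endpoint, such as a mobile device on a weak uplink, is enough to trigger escalation even when the participant count alone would allow the call to stay in meshed mode.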