Historically, telecommunications have involved the transmission of voice and fax signals over a network dedicated to telecommunications, such as the Public Switched Telephone Network (PSTN) or a Private Branch Exchange (PBX). Similarly, data communications between computers have historically been carried over a dedicated data network, such as a Local Area Network (LAN) or a Wide Area Network (WAN). Currently, telecommunications and data transmissions are being merged into an integrated communication network using technology such as Voice over Internet Protocol (VoIP). Since many LANs and WANs transmit computer data using Internet Protocol (IP), VoIP uses this existing technology to transmit voice and fax signals by converting these signals into digital data and encapsulating the data for transmission over an IP network.
Traditional communication networks often support multipoint conferences between a number of participants using different communication devices. A Multipoint Control Unit (MCU) is used to couple these devices, which allows users from distributed geographic locations to participate in the conference. The conference may be audio only (e.g., a teleconference), or it may include video conferencing or broadcasting. A single MCU may be used to accommodate thousands of participants in a multipoint conference.
When supporting three or more endpoints, MCUs typically support one of two layout formats: (i) continuous presence; or (ii) voice-activated. The MCU creates a continuous presence format by tiling together scaled-down versions of the video streams from some or all endpoints into a grid that is displayed at some or all endpoints. However, it is often desirable to view a participant in full screen mode, in which case voice-activated switching (VAS) is used.
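The tiling step of a continuous presence layout can be illustrated with a minimal sketch. The function name `continuous_presence_layout`, the near-square grid policy, and the row-major tile order are assumptions made for illustration; the text above does not specify how an MCU chooses its grid.

```python
import math


def continuous_presence_layout(num_endpoints, frame_width, frame_height):
    """Return one (x, y, width, height) tile per endpoint, in row-major order.

    Hypothetical helper: the grid policy below is an assumption, not a
    description of any particular MCU.
    """
    if num_endpoints < 1:
        raise ValueError("need at least one endpoint")
    # Smallest near-square grid that holds every endpoint (assumed policy).
    cols = math.isqrt(num_endpoints - 1) + 1
    rows = math.ceil(num_endpoints / cols)
    # Each stream is scaled down to fit one equally sized tile.
    tile_w = frame_width // cols
    tile_h = frame_height // rows
    return [
        ((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
        for i in range(num_endpoints)
    ]


if __name__ == "__main__":
    for tile in continuous_presence_layout(4, 1280, 720):
        print(tile)
```

With four endpoints and a 1280x720 output frame, this sketch yields a 2x2 grid of 640x360 tiles. With five endpoints it yields a 3x2 grid in which the last cell is left empty.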
MCUs in voice-activated switching mode may send all participants a copy of a full-resolution video screen from the participant who is speaking the loudest at any given time. In this mode, the loudest speaker never sees him/herself. In order to prevent spurious switching, the MCU typically implements a hysteresis algorithm which switches to a new speaker only after that speaker has been speaking for at least a certain duration of time (e.g., one or two seconds). Accordingly, a hysteresis delay is introduced into the system: every change of speaker reaches the viewers only after this fixed period has elapsed.
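The hysteresis rule can be sketched as a small state machine. This is a minimal illustration, not the algorithm of any specific MCU: the class name `VoiceActivatedSwitcher`, the `hold_seconds` parameter, the audio-level dictionary, and the choice to show the active speaker the previously active speaker are all assumptions.

```python
class VoiceActivatedSwitcher:
    """Switch to a new speaker only after a sustained period of loudest speech."""

    def __init__(self, hold_seconds=1.5):
        self.hold_seconds = hold_seconds  # assumed hysteresis duration
        self.current = None               # speaker currently shown to viewers
        self.previous = None              # speaker shown before the last switch
        self._candidate = None            # loudest speaker awaiting confirmation
        self._candidate_since = None      # time the candidate became loudest

    def update(self, now, levels):
        """Feed audio levels {participant: level} at time `now` (seconds)."""
        if not levels:
            return self.current
        loudest = max(levels, key=levels.get)
        if self.current is None:
            # Assumption: the first speaker is selected without delay.
            self.current = loudest
        elif loudest == self.current:
            # The current speaker is loudest again, so drop any pending switch.
            self._candidate = None
        elif loudest != self._candidate:
            # A new challenger starts its hold timer.
            self._candidate = loudest
            self._candidate_since = now
        elif now - self._candidate_since >= self.hold_seconds:
            # The challenger stayed loudest long enough: switch.
            self.previous = self.current
            self.current = loudest
            self._candidate = None
        return self.current

    def video_source_for(self, viewer):
        """The active speaker never sees himself/herself (assumed fallback)."""
        return self.previous if viewer == self.current else self.current


if __name__ == "__main__":
    switcher = VoiceActivatedSwitcher(hold_seconds=1.5)
    samples = [
        (0.0, {"a": 0.9, "b": 0.1}),
        (1.0, {"a": 0.1, "b": 0.8}),
        (2.0, {"a": 0.1, "b": 0.8}),
        (2.5, {"a": 0.1, "b": 0.8}),
    ]
    for now, levels in samples:
        print(now, switcher.update(now, levels))
```

In this sketch a brief interjection shorter than `hold_seconds` never changes the displayed speaker, which is the spurious switching the hysteresis is meant to suppress. The cost is that every genuine change of speaker is shown late by the same amount.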
An MCU which hosts an interactive videoconference can also multicast a copy of the conference to many receive-only viewers. Such a copy may also be recorded to a disk server by using a highly scalable multi-task streaming mechanism.