1. Field of the Invention
The invention relates to conferencing systems.
2. Description of the Related Art
As everyday applications and services migrate to Internet Protocol (IP) networks at a remarkable rate, and with the growth of the variety of multimedia conferencing equipment, more and more people use multimedia conferencing as their communication tool. Today multimedia conferencing communication can be carried using a plurality of conferencing techniques. Following are a few examples of conferencing techniques: the AVC multimedia conferencing method and the media relay conferencing method. AVC stands for Advanced Video Coding. In this disclosure, the terms multimedia conference, video conference (with or without content) and audio conference may be used interchangeably, and the term video conference can be used as a representative term for them.
Usually an AVC multipoint conference between three or more participants requires an AVC Multipoint Control Unit (MCU). An AVC MCU is a conference controlling entity that is typically located in a node of a network or in a terminal which receives several channels from a plurality of endpoints. According to certain criteria, the AVC MCU processes audio and visual signals and distributes them to each of the participating endpoints via a set of connected channels. Examples of AVC MCUs include the RMX® 2000, which is available from Polycom, Inc. (RMX is a registered trademark of Polycom, Inc.) A terminal in the AVC-communication method, which may be referred to as an AVC endpoint (AVCEP), is an entity on the network, capable of providing real-time, two-way audio and/or audio visual communication with another AVCEP or with the MCU. A more thorough definition of an AVCEP and an MCU can be found in the International Telecommunication Union (“ITU”) standards, such as but not limited to the H.320, H.324, and H.323 standards, which can be found at the ITU Website: www.itu.int.
A common MCU, referred to also as an AVC MCU, may include a plurality of audio and video decoders, encoders, and media combiners (audio mixers and/or video image builders). The MCU may use a large amount of processing power to handle audio and video communication between a variable number of participants (AVCEPs). The communication can be based on a variety of communication protocols and compression standards and may involve different types of AVCEPs. The MCU may need to combine a plurality of input audio or video streams into at least one single output stream of audio or video, respectively, that is compatible with the properties of at least one conferee's AVCEP to which the output stream is being sent. The compressed audio streams received from the endpoints are decoded and can be analyzed to determine which audio streams will be selected for mixing into the single audio stream of the conference. Throughout the present disclosure, the terms decode and decompress can be used interchangeably.
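As an illustration only, the selection of audio streams for mixing can be sketched as follows. The text above does not state the selection criterion; this sketch assumes that each decoded stream has already been reduced to a single energy value and that the loudest streams are mixed. The function name select_streams_for_mixing, the max_mixed parameter, and the energy values are hypothetical.

```python
from typing import Dict, List


def select_streams_for_mixing(
    energy_by_endpoint: Dict[str, float], max_mixed: int = 3
) -> List[str]:
    """Return the endpoint ids whose decoded audio would be mixed, loudest first.

    Assumption: selection is by measured audio energy; silent endpoints
    (energy of zero or less) are never selected.
    """
    # Sort by descending energy; ties are broken by endpoint id for determinism.
    ranked = sorted(energy_by_endpoint.items(), key=lambda item: (-item[1], item[0]))
    return [endpoint for endpoint, energy in ranked[:max_mixed] if energy > 0.0]
```

For example, with four AVCEPs and max_mixed set to 2, only the two loudest endpoints contribute to the mixed audio stream of the conference.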
A conference may have one or more video output streams wherein each output stream is associated with a layout. A layout defines the appearance of a conference on a display of one or more conferees that receive the stream. A layout may be divided into one or more segments where each segment may be associated with a video input stream that is sent by a certain conferee via his/her AVCEP. Each output stream may be constructed of several input streams, resulting in a continuous presence (CP) image. In a CP conference, a user at a remote terminal can observe, simultaneously, several other participants in the conference. Each participant may be displayed in a segment of the layout, where each segment may be the same size or a different size. The choice of the participants displayed and associated with the segments of the layout may vary among different conferees that participate in the same session.
The second type of communication method is Media Relay Conferencing (MRC). In MRC, a Media Relay MCU (MRM) receives one or more streams from each participating Media Relay Endpoint (MRE). The MRM relays to each participating endpoint a set of multiple media streams received from other endpoints in the conference. Each receiving endpoint uses the multiple streams to generate the video CP image, according to a layout, as well as mixed audio of the conference. The CP video image and the mixed audio are played to the MRE's user. An MRE can be a terminal of a conferee in the session which has the ability to receive relayed media from an MRM and deliver compressed media according to instructions from an MRM. A reader who wishes to learn more about an example of an MRC, MRM or an MRE is invited to read U.S. Pat. No. 8,228,363, which is incorporated herein by reference. In the following, the term endpoint may also represent an MRE.
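The relay behavior described above can be sketched as a simple routing table. This is a minimal illustration, not the method of any particular MRM: it assumes the MRM relays every stream of every other endpoint, whereas an actual MRM may relay only a selected subset. The function name relay_plan and the stream identifiers are hypothetical.

```python
from typing import Dict, List


def relay_plan(streams_by_endpoint: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """For each receiving MRE, list the stream ids relayed from all other MREs.

    Assumption: no stream is decoded, mixed, or dropped; each stream is
    forwarded unchanged to every endpoint except its sender.
    """
    plan: Dict[str, List[str]] = {}
    for receiver in streams_by_endpoint:
        plan[receiver] = [
            stream
            for sender, streams in streams_by_endpoint.items()
            if sender != receiver  # an endpoint does not receive its own media
            for stream in streams
        ]
    return plan
```

Each receiving MRE would then compose its own CP image and mixed audio from the streams listed for it.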
In some MRC systems, a transmitting MRE sends its video image in two or more streams; each stream can be associated with a different quality level. The qualities may differ in frame rate, resolution and/or signal to noise ratio (SNR), etc. In a similar way, each transmitting MRE may send its audio in two or more streams that may differ from each other by the compressing bit rate, for example. Such a system can use the plurality of streams to provide different segment sizes in the layouts, different resolutions used by each receiving endpoint, etc. Further, the plurality of streams can be used for overcoming packet loss.
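One way such a plurality of streams could serve different segment sizes is sketched below. The selection rule is an assumption made for illustration (the smallest stream that covers the segment, otherwise the largest one available); the class QualityStream, the function pick_stream_for_segment, and the example resolutions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QualityStream:
    stream_id: str
    width: int
    height: int
    frame_rate: int


def pick_stream_for_segment(
    streams: List[QualityStream], segment_width: int, segment_height: int
) -> Optional[QualityStream]:
    """Pick the lowest-resolution stream that still covers the segment.

    Assumption: if no stream is large enough, the largest one is used and
    would be scaled up by the receiving endpoint.
    """
    if not streams:
        return None
    ordered = sorted(streams, key=lambda s: s.width * s.height)
    for stream in ordered:
        if stream.width >= segment_width and stream.height >= segment_height:
            return stream
    return ordered[-1]
```

A small segment of a layout would thus be fed by the low-resolution stream of a transmitting MRE, while a large segment would be fed by its high-resolution stream.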
Today, MRC is becoming more and more popular. Many video conferencing systems deliver a plurality of quality levels in parallel within one or more streams. For video, for example, the quality can be expressed in a number of domains, such as the temporal domain (frames per second, for example), the spatial domain (HD versus CIF, for example), and/or in quality (sharpness, for example). Video compression standards that can be used for multi-quality streams include, for example, H.264 AVC, H.264 annex G (SVC), MPEG-4, etc. More information on compression standards such as H.264 can be found at the ITU Website www.itu.int, or at www.mpeg.org.
H.323 is an ITU standard. A reader who wishes to learn more about video conferencing standards and protocols is invited to visit the International Telecommunication Union (“ITU”) at the ITU Website: www.itu.int or the Internet Engineering Task Force (IETF) Website: www.ietf.org. An AVC multipoint conference system, MRC, an MCU, an AVC endpoint, an MRE, a Web conferencing client, and a VMR are well known to a person with ordinary skill in the art and have been described in many patents, patent applications and technical books. As such, these will not be further described. Following are examples of patents and patent applications that describe videoconferencing systems: U.S. Pat. Nos. 6,496,216, 6,757,005, 7,174,365, 7,085,243, 8,411,595, 7,830,824, 7,542,068, 8,340,271, 8,228,363, and others.
In the two types of communication methods, the AVC and the MRC, a central entity is needed for handling signaling and the media streams (audio, video), an MCU or an MRM (respectively), for example. In order to establish a video conferencing session, an endpoint can call a central unit such as an MCU or a virtual MCU. A virtual MCU (VMCU) can be a network device, a control server for example, that can communicate with a plurality of MCUs and a plurality of endpoints. A user who initiates a reserved conference and/or an ad-hoc conference can communicate with the VMCU. If sufficient resources are available on one or more MCUs, the reservation is made and connection numbers are assigned. When the time for the conference arrives, one or more MCUs are assigned to the conference and the participants are then connected to the conference. A reader who wishes to learn more about a VMCU is invited to read a plurality of patents and patent applications such as U.S. Pat. No. 7,174,365, U.S. Pat. No. 7,492,730, and many others. An example of a VMCU can be a product such as a DMA® sold by Polycom, Inc. (DMA is a registered trademark of Polycom, Inc.)
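The resource check made by a VMCU at reservation time can be illustrated with the following sketch. It is an assumption-laden simplification: resources are modeled as a count of free ports per MCU, a conference is placed on a single MCU, and the MCU with the most free ports is chosen. The function name reserve_conference and the MCU names are hypothetical and do not describe any actual product.

```python
from typing import Dict, Optional


def reserve_conference(free_ports: Dict[str, int], participants: int) -> Optional[str]:
    """Reserve ports for a conference and return the chosen MCU id, or None.

    Assumption: one port per participant, and the whole conference is
    hosted on one MCU. The free_ports mapping is updated in place.
    """
    if participants <= 0 or not free_ports:
        return None
    # Choose the MCU with the most free ports; ties are broken by MCU id.
    mcu_id = min(free_ports, key=lambda name: (-free_ports[name], name))
    if free_ports[mcu_id] < participants:
        return None  # insufficient resources, so no reservation is made
    free_ports[mcu_id] -= participants
    return mcu_id
```

In this sketch a second request that no longer fits on any MCU is simply refused; an actual VMCU could instead spread the conference over several MCUs.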
After establishing the session, each endpoint sends its media streams to an MCU or an MRM. The MCU or the MRM processes the media streams according to the type of the communication method and transfers the relevant streams to receiving endpoints. Throughout the description and the claims, the term MCU can be used as a representative term for an MRM and an AVC MCU.
An MCU may comprise a Multipoint-Controller (MC) and a Multipoint-Processor (MP). The MC can be a packet-switched (SW) network entity, located in the network, that provides the signaling and control of three or more terminals participating in a multipoint conference. An example of a packet SW network can be an IP network. The MC may also connect two terminals in a point-to-point conference, which may later develop into a multipoint conference. The MC provides capability negotiation with all terminals to achieve common levels of communications, and may also control conference resources. The MC signaling and control can be implemented by using a standard signaling protocol such as SIP. SIP stands for Session Initiation Protocol. A reader who wishes to learn more about SIP is invited to visit the IETF (Internet Engineering Task Force) web site: www.ietf.org. However, the MC does not perform mixing or switching of audio, video and data. The Multipoint Processor (MP) is a media entity on the network providing the centralized processing of audio, video, and/or data streams in a multipoint conference.
The MP provides the media processing such as decoding, mixing, composing, encoding, switching, routing or other processing of media streams under the control of the MC. The MP may process a single media stream or multiple media streams depending on the type of conference supported. A single MC can control a plurality of MPs.
Two common topologies are used in support of multi-point conferencing today:
1) Centralized Topology (FIG. 1): with this method, all participants 110 send one or more media streams 120 up to a central media processing entity 130, and each receives one or more streams 140 from the same centralized entity 130. The streams 120 transmitted upstream to the centralized entity 130 can include one or more local camera feeds and one or more content feeds. The streams 140 transmitted back from the centralized entity 130 are rendered on screen and shown to the participant. When using a centralized approach, two flavors are used today:
A. Transcoding: where the central entity 130 transcodes all incoming and outgoing streams, typically using an MCU such as an AVC MCU. With this approach the centralized entity consumes a lot of compute resources per participant. This becomes an issue for scale and the budget needed for allocating such resources.
B. Media Relay: where the centralized entity 130, typically an MRM, relays all incoming and outgoing streams. With current relay deployments, the centralized entity 130 receives one or more streams 120 from each participant 110, and sends multiple streams 140 back down to that participant 110, so the participant 110 can see the other participants 110 in the call. This means that all media must flow through a single entity, which could become a bottleneck.
2) Mesh Topology (FIG. 2): with this method, streams 220 are sent peer-to-peer between the participants 210. Each participant 210 sends a copy of its stream(s) to each of the other participants 210 and receives media stream(s) from each other participant 210 in the session.
Each method carries its own limitations. Centralized topology sessions depend on heavy-lifting media transcoding resources, which are expensive and have scaling limitations. Mesh topology sessions require a good deal of CPU on each endpoint 210 for processing the streams being sent and received, and the total amount of bandwidth required by each participant can also be substantial in order to have a successful experience.
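The difference in per-participant load between the topologies can be made concrete with a short count of streams. The sketch below is illustrative only and assumes that each participant contributes exactly one media stream, although the description above allows one or more. The function name per_participant_streams and the topology labels are hypothetical.

```python
from typing import Tuple


def per_participant_streams(topology: str, participants: int) -> Tuple[int, int]:
    """Return (streams sent, streams received) for one participant.

    Assumption: every participant contributes exactly one media stream.
    """
    others = max(participants - 1, 0)
    if topology == "transcoding":
        return (1, 1)  # one stream up, one composed stream back
    if topology == "relay":
        return (1, others)  # one stream up, one relayed stream per other participant
    if topology == "mesh":
        return (others, others)  # a copy sent to, and a stream received from, each peer
    raise ValueError(f"unknown topology: {topology}")


def total_mesh_streams(participants: int) -> int:
    """Total number of peer-to-peer streams in a full mesh session."""
    return participants * max(participants - 1, 0)
```

Under this assumption, a five-party mesh session has each endpoint sending four streams and receiving four, for twenty streams in total, whereas in the centralized flavors each endpoint sends a single stream and the load is concentrated on the central entity.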