As traffic over Internet Protocol (IP) networks continues to grow rapidly, and as the variety of multimedia conferencing equipment expands, more and more people use multimedia conferencing as a communication tool. Videoconferencing sessions often require content to be presented along with the video image. Business meetings, educational sessions, lectures, marketing presentations, and professional meetings (such as design reviews) typically require content presentation. Different types of content, such as EXCEL® tables, POWERPOINT® presentations, slides, charts, and drawings, can be presented during a videoconferencing session. (EXCEL and POWERPOINT are registered trademarks of Microsoft Corporation.)
Usually the content is important for understanding the current discussion; therefore, the content is delivered to all the conferees. Some videoconferencing sessions span a large number of conferees' endpoints with different capabilities, connected over a large number of connection types of differing quality. Further, some conferences are served by two or more cascaded Multipoint Control Units (MCUs).
An MCU is a conference controlling entity that is typically located in a node of a network or in a terminal that receives several channels from endpoints. The MCU processes audio and visual signals according to certain criteria and distributes the signals to a set of connected channels. Examples of MCUs include the MGC-100™ and the RMX 2000®, which are available from Polycom, Inc. (MGC-100 is a trademark of Polycom, Inc. RMX 2000 is a registered trademark of Polycom, Inc.) A terminal, which may be referred to as an endpoint (EP), is an entity on the network capable of providing real-time, two-way audio, video, and/or visual content communication with another EP or with an MCU. A more thorough definition of an EP and an MCU can be found in the International Telecommunication Union (ITU) standards, such as but not limited to the H.320, H.324, and H.323 standards, which can be found at the ITU website: www.itu.int. Information about Session Initiation Protocol (SIP) can be found at the Internet Engineering Task Force (IETF) website, www.ietf.org.
Usually, at a delivery EP, content is compressed by an encoder other than the one used for the video image. In most cases, the frame rate used by the encoder to compress the content is low, for example, 1 to 10 frames per second. The compressed content can be sent from the delivery EP toward an MCU over a separate stream, using a different channel than the EP video image. From the MCU, the content can be sent toward one or more receiving EPs as Video Switching (VSW) images over a stream that is separate from the continuous presence (CP) video image of the conference. The parameters of the content encoder are negotiated to the highest parameters that the participating EPs have in common. In some video conferences, the MCU may transcode the content that is sent toward one or more receiving EPs. Further, for some limited EPs that cannot handle the content as a separate video stream, the MCU may treat the content as a video image from an EP and may add it to a segment in the CP video image that is targeted toward the limited receiving EPs.
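The negotiation to the highest common parameters can be illustrated with a minimal Python sketch. It assumes that each EP advertises an upper limit per parameter and that the common value is simply the smallest of those limits; the ContentCaps fields and the highest_common function are hypothetical names chosen for illustration and are not taken from any signaling standard. Real capability exchange selects among discrete, standardized modes rather than taking raw minimums.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentCaps:
    """Hypothetical per-endpoint limits for the content stream."""
    max_width: int
    max_height: int
    max_fps: int
    max_kbps: int


def highest_common(caps):
    """Return the best parameters that every endpoint can still decode.

    Assumption: each field is an upper limit, so the highest value all
    endpoints share is the minimum of that field across the endpoints.
    """
    caps = list(caps)
    if not caps:
        raise ValueError("at least one endpoint is required")
    return ContentCaps(
        max_width=min(c.max_width for c in caps),
        max_height=min(c.max_height for c in caps),
        max_fps=min(c.max_fps for c in caps),
        max_kbps=min(c.max_kbps for c in caps),
    )


# Three illustrative endpoints with different capabilities.
endpoints = [
    ContentCaps(1920, 1080, 10, 1024),
    ContentCaps(1280, 720, 5, 512),
    ContentCaps(1024, 768, 10, 768),
]
common = highest_common(endpoints)
print(common)
```

In this sketch a single weak EP lowers the parameters for every receiver, which is one reason an MCU may instead transcode the content for selected EPs.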
A user at a receiving EP that receives a CP video image can simultaneously observe several other participants in the conference. Each participant may be displayed in a segment of the layout, and the segments may be the same size or different sizes. The choice of the participants displayed and associated with the segments of the layout may vary among different conferees that participate in the same session. A user at a receiving EP that receives a VSW image can observe only one other participant in the conference. That participant may be displayed over the entire screen of the receiving EP.
The blend of a plurality of conferees, a plurality of endpoint qualities, and a plurality of connections of different quality levels increases the frequency of missing packets. Usually, missing packets are followed by Intra requests for the content data. Most of the Intra requests are relevant to only a few endpoints, or even to a single endpoint, but they affect all the endpoints, because the content is commonly distributed from a single encoder to a plurality of endpoints.
Common video compression methods use Intra and Inter frames. An Intra frame is a video frame that is compressed relative to information that is contained only within the same frame and not relative to any other frame in the video sequence. An Inter frame is a video frame that is compressed relative to information that is contained within the same frame and relative to one or more other frames in the video sequence.
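The distinction can be sketched with a toy Python example. This is not a real codec: a frame is modeled as a short list of sample values, the Intra packet stores every sample, and the Inter packet stores only the differences from a reference frame. The function names and the "I"/"P" tags are hypothetical. The sketch also shows why a receiver that has lost its reference cannot decode an Inter frame and must ask for an Intra frame.

```python
def encode_intra(frame):
    """Toy Intra coding: the packet depends only on this frame."""
    return ("I", list(frame))


def encode_inter(frame, reference):
    """Toy Inter coding: store only the samples that differ from the reference."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same size")
    changes = [
        (i, cur - ref)
        for i, (cur, ref) in enumerate(zip(frame, reference))
        if cur != ref
    ]
    return ("P", changes)


def decode(packet, reference=None):
    """Rebuild a frame; an Inter packet is useless without its reference."""
    kind, payload = packet
    if kind == "I":
        return list(payload)
    if reference is None:
        raise ValueError("Inter frame needs a reference; request an Intra frame")
    out = list(reference)
    for index, delta in payload:
        out[index] += delta
    return out


# A mostly static slide: only two samples change between frames.
slide = [10, 10, 10, 10, 10, 10, 10, 10]
next_slide = [10, 10, 12, 10, 10, 10, 9, 10]

intra = encode_intra(slide)
inter = encode_inter(next_slide, slide)

decoded = decode(inter, decode(intra))
print(len(intra[1]), "samples in the Intra packet")
print(len(inter[1]), "changes in the Inter packet")
```

For mostly static content such as slides, the Inter packet in this sketch carries far fewer values than the Intra packet, which mirrors why Intra frames are the expensive case.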
Compressing a video frame relative to information that is contained only within the same frame requires more computing time and delivers more data. Therefore, common encoders encode an Intra frame at a lower quality, for example in sharpness and/or luminance, and then improve the quality with each subsequent Inter frame. The result is a “breathing effect,” for example in the sharpness and brightness of the content image. A plurality of Intra frames increases the “breathing effect.”
In order to reduce the number of Intra requests, some techniques filter Intra requests from problematic endpoints; others use a content transcoder (decoder/encoder) in an MCU that is connected to the endpoint that delivers the content, instead of just switching the content. However, such a solution merely moves the problem to the encoder at the MCU. None of those methods reduces the “breathing effect” caused by many Intra requests for content in large conferences.
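One possible form of such Intra request filtering is sketched below in Python. It is an illustrative assumption, not a description of any particular MCU: the IntraRequestFilter class, its max_requests and window_s parameters, and the endpoint identifiers are all hypothetical. Each endpoint is given a budget of requests per sliding time window, and requests beyond the budget are dropped instead of being forwarded to the content encoder.

```python
from collections import defaultdict, deque


class IntraRequestFilter:
    """Hypothetical per-endpoint limiter for Intra requests."""

    def __init__(self, max_requests=2, window_s=10.0):
        self.max_requests = max_requests
        self.window_s = window_s
        # Timestamps of recently forwarded requests, per endpoint.
        self._history = defaultdict(deque)

    def allow(self, endpoint_id, now):
        """Return True if the request should be forwarded to the encoder."""
        history = self._history[endpoint_id]
        # Forget requests that have aged out of the sliding window.
        while history and now - history[0] >= self.window_s:
            history.popleft()
        if len(history) >= self.max_requests:
            return False  # problematic endpoint: drop the request
        history.append(now)
        return True


flt = IntraRequestFilter(max_requests=2, window_s=10.0)
# (endpoint, time in seconds); "ep-7" sits on a lossy link.
requests = [("ep-7", 0.0), ("ep-7", 1.0), ("ep-7", 2.0), ("ep-3", 2.5), ("ep-7", 11.0)]
forwarded = [(ep, t) for ep, t in requests if flt.allow(ep, t)]
print(forwarded)
```

Every request that passes the filter in this sketch still produces an Intra frame that reaches all the endpoints fed by the same encoder, so filtering lowers the request rate without removing the effect on the other receivers.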