Videoconferencing enables individuals located remotely from one another to conduct a face-to-face meeting. Videoconferencing may be performed by using audio and video telecommunications. A videoconference may be between as few as two sites (point-to-point), or between several sites (multi-point). A conference site may include a single participant (user) or several participants (users). Videoconferencing may also be used to share documents, presentations, information, and the like.
Participants (users) may take part in a videoconference via a videoconferencing endpoint (EP). An endpoint (EP) may be a terminal on a network. An endpoint may be capable of providing real-time, two-way, audio/visual/data communication with other terminals and/or with a multipoint control unit (MCU). An endpoint (EP) may provide information/data in different forms, such as audio; audio and video; and data and video. The terms “terminal,” “site,” and “endpoint” may be used interchangeably, and the description, drawings, and claims of the present disclosure use the term “endpoint” as a representative term for the above group.
An endpoint (EP) may comprise a display unit (screen) on which video images from one or more remote sites may be displayed. Example endpoints may be units of the POLYCOM® VSX® and HDX® series, each available from Polycom, Inc. (POLYCOM, VSX, and HDX are registered trademarks of Polycom, Inc.) A videoconferencing endpoint (EP) may send audio, video, and/or data from a local site to one or more remote sites, and display video and/or data received from the remote sites on its screen.
Video images displayed on a screen at an endpoint may be displayed in an arranged layout. A layout may include one or more segments for displaying video images. A segment may be a predefined portion of a screen of a receiving endpoint that may be allocated to a video image received from one of the sites participating in the videoconference session. In a videoconference between two participants, a segment may cover the entire display area of the screens of the endpoints. At each site, the segment may display the video image received from the other site.
Another example of a video display mode in a videoconference between a local site and multiple remote sites may be a switching mode. A switching mode may be such that video/data from only one of the remote sites is displayed on the local site's screen at a time. The displayed video may be switched to video received from another site depending on the dynamics of the conference.
In contrast to the switching mode, in a continuous presence (CP) conference, a conferee (participant) at a local terminal (site) may simultaneously observe several other conferees from different terminals participating in the videoconference. Each site may be displayed in a different segment of the layout, which is displayed on the local screen. The segments may be the same size or of different sizes. The combinations of the sites displayed on a screen and their association to the segments of the layout may vary among the different sites that participate in the same session. Furthermore, in a continuous presence (CP) layout, a received video image from a site may be scaled up or down, and/or cropped in order to fit its allocated segment size. It should be noted that the terms “conferee,” “user,” and “participant” are used interchangeably in this disclosure. In the description, drawings, and claims of the present disclosure, the term “conferee” may be used as a representative term for the above group.
An MCU may be used to manage a videoconference. An MCU is a conference controlling entity that is typically located in a node of a network or in a terminal. The MCU receives several channels from endpoints and, according to certain criteria, processes audio and/or visual signals and distributes them to a set of connected channels.
Example MCUs may be the MGC-100, RMX 2000®, and RMX 4000®, available from Polycom, Inc. (RMX 2000 and RMX 4000 are registered trademarks of Polycom, Inc.). Some MCUs may be composed of two logical units: a media controller (MC) and a media processor (MP). A more thorough definition of an endpoint (terminal) and an MCU may be found in the International Telecommunication Union (“ITU”) standards, such as the H.320, H.324, and H.323 standards. Additional information regarding the ITU standards may be found at the ITU website www.itu.int.
In a CP videoconferencing session, the association between sites and segments may be dynamically changed according to the activities taking place in the conference. In some layouts, one of the segments may be allocated to a current speaker. The other segments of that layout may be allocated to other sites that were selected as “presented sites” or “presented conferees.” A current speaker may be selected according to certain criteria, such as having the highest audio energy during a certain percentage of a monitoring period. The other “presented sites” may include the image of the conferee that was the previous speaker; the sites having audio energy above a certain threshold; and certain conferees required by management decisions to be visible.
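By way of illustration only, the speaker-selection criterion described above may be sketched as follows. The function name, the sampling scheme, and the parameters energy_threshold and min_fraction are hypothetical and are introduced here as assumptions; the sketch does not describe the selection logic of any particular MCU.

```python
from typing import Dict, List, Optional


def select_current_speaker(
    energy_samples: Dict[str, List[float]],
    energy_threshold: float,
    min_fraction: float = 0.5,
) -> Optional[str]:
    """Pick the site that was loudest in a sufficient share of a monitoring period.

    energy_samples maps a site name to its audio energy per sampling interval
    (hypothetical representation). A site "wins" an interval when it has the
    highest energy in that interval and that energy meets energy_threshold.
    The site with the most wins is returned only if it won at least
    min_fraction of the intervals; otherwise None is returned.
    """
    if not energy_samples:
        return None
    n_intervals = min(len(samples) for samples in energy_samples.values())
    if n_intervals == 0:
        return None

    wins = {site: 0 for site in energy_samples}
    for i in range(n_intervals):
        loudest = max(energy_samples, key=lambda site: energy_samples[site][i])
        if energy_samples[loudest][i] >= energy_threshold:
            wins[loudest] += 1

    best = max(wins, key=lambda site: wins[site])
    if wins[best] / n_intervals >= min_fraction:
        return best
    return None


samples = {
    "A": [0.9, 0.8, 0.1, 0.7],
    "B": [0.2, 0.3, 0.6, 0.1],
    "C": [0.0, 0.0, 0.0, 0.0],
}
speaker = select_current_speaker(samples, energy_threshold=0.5, min_fraction=0.5)
```

In this example, site A has the highest energy in three of the four intervals, so it may be selected as the current speaker when min_fraction is 0.5, whereas no speaker is selected when a stricter fraction such as 0.9 is required.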
A received video image may be processed to meet a required segment size, resolution, etc. The video image may be processed by the MCU, including manipulation of the received video image, scaling up/down the image, and cropping a portion of the video image. An MCU may crop lines or columns from one or more edges of a received video image in order to fit it to an area of a segment in a certain layout. Another cropping technique may crop the edges of a received image according to a region of interest of the received image, as disclosed in co-owned U.S. patent application Ser. No. 11/751,558, the entire contents of which are incorporated herein by reference.
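As a non-limiting sketch, the scaling and edge cropping described above may be expressed as a geometry computation. The function fit_to_segment and the choice of a centered crop are assumptions made for illustration; region-of-interest cropping, as in the referenced application, is not shown.

```python
from typing import Tuple


def fit_to_segment(
    src_w: int, src_h: int, seg_w: int, seg_h: int
) -> Tuple[float, Tuple[int, int, int, int]]:
    """Compute a scale factor and a centered crop so an image fills a segment.

    The image is scaled uniformly until it covers the segment in both
    dimensions; the excess lines or columns are then cropped equally from
    opposite edges. Returns (scale, (left, top, width, height)), where the
    crop rectangle is given in the coordinates of the scaled image.
    """
    if min(src_w, src_h, seg_w, seg_h) <= 0:
        raise ValueError("all dimensions must be positive")

    scale = max(seg_w / src_w, seg_h / src_h)
    # Guard against rounding leaving the scaled image smaller than the segment.
    scaled_w = max(seg_w, round(src_w * scale))
    scaled_h = max(seg_h, round(src_h * scale))

    left = (scaled_w - seg_w) // 2
    top = (scaled_h - seg_h) // 2
    return scale, (left, top, seg_w, seg_h)


# A 1280x720 image placed in a 640x480 segment: columns are cropped from
# the left and right edges after scaling.
scale, crop = fit_to_segment(1280, 720, 640, 480)
```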
In a videoconferencing session, a size of a segment in a layout may be defined according to a layout type selected for the session. For example, in a 2×2 type layout each segment may be substantially a quarter of the display. If five sites are taking part in such a session, then each conferee may view the other four sites simultaneously.
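For illustration, the relation between a layout type and its segment sizes may be sketched as below. The function grid_layout and the (left, top, width, height) tuple representation are hypothetical, and only equal-size grid layouts are assumed.

```python
from typing import List, Tuple

Segment = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


def grid_layout(screen_w: int, screen_h: int, rows: int, cols: int) -> List[Segment]:
    """Divide a screen into rows x cols equal segments, listed row by row."""
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    seg_w = screen_w // cols
    seg_h = screen_h // rows
    return [
        (c * seg_w, r * seg_h, seg_w, seg_h)
        for r in range(rows)
        for c in range(cols)
    ]


# A 2x2 layout on a 1280x720 screen: each segment is a quarter of the display.
segments = grid_layout(1280, 720, rows=2, cols=2)
```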
In a CP videoconference, each presented site may be displayed over a portion of a screen. A participant typically prefers to see the video images from the other sites instead of his or her own video image. In a CP conference, each presented conferee is typically associated with an output port of an MCU. In some cases, a plurality of sites may receive a similar layout from an MCU, such as one of the layouts that are sent toward one of the presented conferees.
An output port typically comprises a CP image builder and an encoder. A CP image builder typically obtains decoded video images from each of the presented sites. The CP image builder may resize (scale and/or crop) the decoded video images to a required size of a segment in which the image will be presented. The CP image builder may further write the resized image in a CP frame memory in a location that is associated with the location of the segment in the layout. Once the CP frame memory holds all of the presented images in their associated segments, the CP image may be read from the CP frame memory by the encoder.
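The writing of resized images into a CP frame memory may be sketched, under simplifying assumptions, as follows. The frame memory is modeled here as a list of pixel rows, and the function compose_cp_frame and its placement tuples are hypothetical; an actual CP image builder would operate on decoded video planes.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values (simplified, single plane)


def compose_cp_frame(
    frame_w: int, frame_h: int, placements: List[Tuple[Image, int, int]]
) -> Image:
    """Write already-resized images into a CP frame at their segment locations.

    Each placement is (image, left, top), where (left, top) is the location
    of the segment in the layout. Pixels not covered by any segment stay 0.
    """
    frame = [[0] * frame_w for _ in range(frame_h)]
    for image, left, top in placements:
        for dy, row in enumerate(image):
            if top + dy >= frame_h or left + len(row) > frame_w:
                raise ValueError("image does not fit in the frame")
            frame[top + dy][left:left + len(row)] = row
    return frame


site_a = [[1, 1], [1, 1]]  # 2x2 image from a first presented site
site_b = [[2, 2], [2, 2]]  # 2x2 image from a second presented site
cp_frame = compose_cp_frame(4, 2, [(site_a, 0, 0), (site_b, 2, 0)])
```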
The encoder may encode the CP image. The encoded and/or compressed CP video image may be sent toward the endpoint of the relevant conferee. Output ports of an MCU are well known in the art and are described in multiple patents and patent applications. An example frame memory module may employ two or more frame memories, such as a currently encoded frame memory and a next frame memory. The memory module may alternately store and output video of consecutive frames. A reader who wishes to learn more about a typical output port is invited to read U.S. Pat. No. 6,300,973, which is incorporated herein by reference in its entirety for all purposes.
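A frame memory module that alternates between two frame memories may be sketched as below. The class name PingPongFrameMemory and its methods are hypothetical and are given only to illustrate the alternating store-and-output behavior; they are not taken from the referenced patent.

```python
from typing import Any, List, Optional


class PingPongFrameMemory:
    """Two frame memories: one being written, one available to the encoder."""

    def __init__(self) -> None:
        self._buffers: List[Optional[Any]] = [None, None]
        self._write_index = 0

    def write(self, frame: Any) -> None:
        """Store the next frame without disturbing the frame being encoded."""
        self._buffers[self._write_index] = frame

    def swap(self) -> None:
        """Exchange roles once the next frame is complete."""
        self._write_index ^= 1

    def read(self) -> Optional[Any]:
        """Return the frame currently available for encoding."""
        return self._buffers[self._write_index ^ 1]


memory = PingPongFrameMemory()
memory.write("frame 0")
memory.swap()
memory.write("frame 1")        # builder fills the next frame
being_encoded = memory.read()  # encoder still reads the completed frame
```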
An output port typically consumes substantial computational resources, especially when the output port is associated with a high definition (HD) endpoint that displays high-resolution video images at a high frame rate. The resources needed for the output ports may limit the capacity of the MCU and have a significant influence on the cost of an MCU.
In order to solve the capacity/cost issue, some conventional MCUs offer a conference-on-port option, in which a single output port is allocated to a CP conference. In a conference-on-port MCU, all of the sites that participate in the session receive the same CP video image. Consequently, the presented conferees may see their own images.
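The trade-off between per-conferee output ports and a conference-on-port option may be illustrated with the following sketch. The function names and the representation of a layout as a list of site names are assumptions made for illustration only.

```python
from typing import Dict, List


def per_conferee_layouts(presented: List[str], sites: List[str]) -> Dict[str, List[str]]:
    """Each site receives a CP image that omits its own video image."""
    return {site: [p for p in presented if p != site] for site in sites}


def conference_on_port_layout(presented: List[str], sites: List[str]) -> Dict[str, List[str]]:
    """Every site receives the same CP image, so presented sites see themselves."""
    shared = list(presented)
    return {site: shared for site in sites}


def count_distinct_images(layouts: Dict[str, List[str]]) -> int:
    """Number of different CP images, each of which must be built and encoded."""
    return len({tuple(layout) for layout in layouts.values()})


sites = ["A", "B", "C", "D", "E"]
dedicated = per_conferee_layouts(sites, sites)
shared = conference_on_port_layout(sites[:4], sites)
```

In this five-site example, the per-conferee arrangement yields five distinct CP images, each showing the other four sites, whereas the conference-on-port arrangement yields a single CP image in which sites A through D also see their own images.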