Videoconferencing enables individuals at different locations to hold face-to-face meetings on short notice using audio and video telecommunications. A videoconference may involve as few as two sites (point-to-point) or several sites (multi-point). A single participant may be located at a conferencing site, or there may be several participants at a site, such as in a conference room. Videoconferencing may also be used to share documents, information, and the like.
Participants in a videoconference interact with participants at other sites via a videoconferencing endpoint. An endpoint is a terminal on a network, capable of providing real-time, two-way audio/visual/data communication with other endpoints or with a multipoint control unit (MCU). An endpoint may provide speech only, speech and video, or speech, data and video communications, etc. A videoconferencing endpoint typically comprises a display unit on which video images from one or more remote sites may be displayed. Example endpoints include POLYCOM® VSX® and HDX® series, each available from Polycom, Inc. (POLYCOM, VSX, and HDX are registered trademarks of Polycom, Inc.). The videoconferencing endpoint sends audio, video, and/or data from a local site toward a remote site(s) and displays video and/or data received from the remote site(s) on a screen.
Video images displayed on a screen at a videoconferencing endpoint may be arranged in a layout. The layout may include one or more segments for displaying video images. A segment is a portion of the screen of a receiving endpoint that is allocated to a video image received from one of the sites participating in the session. For example, in a videoconference between two participants, a segment may cover the entire display area of the screen of the local endpoint. Another example is a videoconference between a local site and multiple remote sites conducted in switching mode, such that video from only one remote site is displayed at the local site at any one time, and the displayed remote site may be switched depending on the dynamics of the conference. In contrast, in a continuous presence (CP) conference, a conferee at a terminal may simultaneously observe several other participants' sites in the conference. Each site may be displayed in a different segment of the layout, where each segment may be the same size or a different size. The choice of the sites displayed and associated with the segments of the layout may vary among different conferees that participate in the same session. In a CP layout, a received video image from a site may be scaled down or cropped in order to fit a segment size.
An MCU may be used to manage a videoconference. An MCU is a conference controlling entity that may be located in a node of a network, in a terminal, or elsewhere. The MCU may receive and process several media channels from access ports according to certain criteria and distribute these media channels to the connected channels via other ports. Examples of MCUs include the MGC-100 and RMX® 2000, available from Polycom, Inc. (RMX 2000 is a registered trademark of Polycom, Inc.). Some MCUs are composed of two logical units: a media controller and a media processor. A more thorough definition of an endpoint and an MCU may be found in the International Telecommunication Union (“ITU”) standards, such as but not limited to the H.320, H.324, and H.323 standards. Additional information regarding the ITU standards may be found at the ITU website www.itu.int.
To present a video image within a segment of a screen layout of a receiving endpoint (site), the entire received video image may be manipulated, scaled down and displayed, or a portion of the video image may be cropped by the MCU and displayed. An MCU may crop lines or columns from one or more edges of a received conferee video image in order to fit it to the area of a segment in the layout of the videoconferencing image. Another cropping technique may crop the edges of the received image according to a region of interest in the image, as disclosed in U.S. Pat. No. 8,289,371, the entire contents of which are incorporated herein by reference.
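The edge-cropping approach described above can be illustrated with a minimal sketch. It assumes a simple centered crop that removes an equal number of columns (or lines) from opposite edges until the image matches the aspect ratio of the segment; the function name `center_crop_rect` and its parameters are hypothetical and are not taken from any MCU product or from the referenced patent.

```python
def center_crop_rect(src_w, src_h, seg_w, seg_h):
    """Return (x, y, w, h): the largest centered region of a src_w x src_h
    image that has the same aspect ratio as a seg_w x seg_h segment.

    Illustrative assumption: the crop is symmetric about the image center.
    """
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if src_w * seg_h > src_h * seg_w:
        # Source is wider than the segment: crop columns from left and right.
        crop_w = src_h * seg_w // seg_h
        return ((src_w - crop_w) // 2, 0, crop_w, src_h)
    # Source is taller than (or matches) the segment: crop lines from top and bottom.
    crop_h = src_w * seg_h // seg_w
    return (0, (src_h - crop_h) // 2, src_w, crop_h)


# A 1280x720 image shown in a square 360x360 segment keeps its middle 720 columns.
print(center_crop_rect(1280, 720, 360, 360))
```

After cropping, the remaining region would still need to be scaled to the segment's pixel dimensions. A region-of-interest crop would differ in that the retained rectangle is positioned around the detected region rather than around the image center.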
In a videoconferencing session, the size of a segment in a layout may be defined according to a layout selected for the session. For example, in a 2×2 layout each segment may be substantially a quarter of the display, as illustrated in FIG. 1. Layout 100 includes segments 112, 114, 116 and 118. In a 2×2 layout, if five sites are taking part in a session, conferees at each site typically may see the other four sites.
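As a rough illustration of how segment geometry follows from the selected layout, the sketch below divides a screen into an evenly spaced grid. The function name `grid_layout`, the rectangle format, and the 1920×1080 screen size are assumptions made for the example only.

```python
def grid_layout(screen_w, screen_h, rows, cols):
    """Return a list of (x, y, w, h) segments in row-major order for an
    evenly divided rows x cols layout (hypothetical helper)."""
    segments = []
    for r in range(rows):
        # Integer boundaries so the segments tile the screen without gaps.
        y0 = r * screen_h // rows
        y1 = (r + 1) * screen_h // rows
        for c in range(cols):
            x0 = c * screen_w // cols
            x1 = (c + 1) * screen_w // cols
            segments.append((x0, y0, x1 - x0, y1 - y0))
    return segments


# In a 2x2 layout each segment covers a quarter of the display.
for segment in grid_layout(1920, 1080, 2, 2):
    print(segment)
```

With five sites in a session, each endpoint would fill these four segments with the video images of the four other sites.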
In a CP videoconferencing session, the association between sites and segments may be dynamically changed according to the activity taking place in the conference. In some layouts, one of the segments may be allocated to a current speaker, and the other segments may be allocated to other sites that were selected as presented conferees. The current speaker is typically selected according to certain criteria, such as the loudest speaker during a certain percentage of a monitoring period. The other sites (in the other segments) may include the previous speaker, sites with audio energy above the others, certain conferees required by management decisions to be visible, etc.
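One possible reading of the speaker-selection criterion mentioned above is sketched below. It assumes the monitoring period is divided into equal intervals, each with one audio-energy measurement per site, and that a site becomes the current speaker only if it is the loudest in at least a given fraction of those intervals. The function name `select_speaker`, the data format, and the 0.6 default threshold are all hypothetical.

```python
from collections import Counter


def select_speaker(energy_samples, threshold=0.6, previous=None):
    """Pick the current speaker for one monitoring period.

    energy_samples: list of dicts mapping site id -> audio energy,
                    one dict per interval of the monitoring period.
    threshold:      minimum fraction of intervals in which a site must be
                    the loudest (an assumed value, not from the source).
    previous:       speaker to keep if no site reaches the threshold.
    """
    # Count, for each site, the intervals in which it was the loudest.
    loudest = Counter(
        max(sample, key=sample.get) for sample in energy_samples if sample
    )
    if not loudest:
        return previous
    site, count = loudest.most_common(1)[0]
    if count / len(energy_samples) >= threshold:
        return site
    return previous


samples = [
    {"A": 5, "B": 1},
    {"A": 4, "B": 2},
    {"A": 1, "B": 3},
    {"A": 6, "B": 2},
    {"A": 7, "B": 0},
]
print(select_speaker(samples, previous="B"))
```

Retaining the previous speaker when no site reaches the threshold is a design choice in this sketch; it avoids switching the layout on brief noises.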
In the example illustrated in FIG. 1, only three quarters of the display area are used (segments 112, 114, and 116), and the fourth quarter, segment 118, is occupied by a background color. Such a situation may occur when only four sites are active and each site sees the other three. Furthermore, segment 116 displays an empty room, while the sites presented in segments 112 and 114 each include a single conferee (conferees 120 and 130). Consequently, during this period of the session only half of the screen area is effectively used and the other half is ineffectively used. The areas of segments 116 and 118 do not contribute to the conferees' experience and therefore are not exploited in a smart and effective manner.
Furthermore, as may be seen in both segments 112 and 114, a major area of each image is redundant. The video images capture a large portion of the room, while the conferees' images 120 and 130 are small and occupy only a small area. Thus, a significant portion of the display area is wasted on uninteresting areas. Consequently, the area available to the conferees' images is reduced and the experience of the conferees viewing the layout of the videoconference is not optimal.
Moreover, in some conference sessions, one or more of the sites have a single participant, while other sites have two or more participants. In currently available layouts, each site receives a segment of similar size and, as a result, each participant at a site with a plurality of conferees is displayed over a smaller area than a conferee at a site with fewer participants, degrading the experience of the viewer.
In some videoconferencing sessions, there may be sites with a plurality of conferees where only one of them is active and does the talking with the other sites. Usually the video camera in this room captures the entire room, with the plurality of conferees, allocating a small screen area to each one of the conferees, including the active conferee. In other sessions, content (data) may be presented as part of the layout, typically in one of the segments, independently of the video images presented in the other segments.
If, during a conference call, one of the conferees steps far from the camera, that conferee's image will appear smaller and again the experience of the conferees viewing the layout of the videoconference is degraded. Likewise, if the conferees at a displayed site leave the room for a certain time and return afterward, the empty room is displayed on the layout during the conferees' absence.
In some known techniques, the viewing conferees at the other sites may manually change the layout viewed at their endpoints to adjust to the dynamics of the conference, but this requires the conferees to stop what they are doing and deal with a layout menu to make such an adjustment.