1. Technical Field
This disclosure relates in general to the field of video conferencing and telepresence systems. More specifically, this disclosure relates to a method, a device and a computer system for processing images in a conference between a plurality of video conferencing terminals.
2. Discussion of the Background
Conventional video conferencing systems include a number of video conferencing terminals (endpoints) communicating real-time video, audio and/or data (often referred to as duo video) streams over and between various networks such as WAN, LAN and circuit-switched networks.
A number of video conference systems residing at different sites may participate in the same conference, most often through one or more MCUs (Multipoint Control Units) performing, amongst other tasks, switching and mixing functions to allow the audiovisual terminals to intercommunicate properly.
Video conferencing systems presently provide communication between at least two locations, allowing a video conference among participants situated at each location.
Telepresence systems are enhanced video conference systems. Typically, terminals in telepresence systems have a plurality of large-scale displays for life-sized video, often installed in rooms with interiors dedicated to and tailored for video conferencing, all to create a conference as close to personal (i.e., in-person) meetings as possible. The terminals in telepresence systems are provided with one or more cameras. The outputs of those cameras are transmitted along with audio signals to a corresponding plurality of displays at a second location such that the participants at the first location are perceived to be present or face-to-face with participants at the second location.
A display device of a video conferencing device, in particular a video conferencing terminal 100 of the telepresence type, is shown in FIG. 1A as arranged in front of a plurality of (four illustrated) local conference participants 102. The local participants 102 are located along a table 104, facing the terminal 100, which includes a plurality of display screens 106. In the illustrated example, three display screens 106 are included in the display device. A first, a second and a third display screen 106 are arranged adjacent to each other. The first, second and third display screens 106 are used for displaying images captured at one or more remote conference sites of a corresponding telepresence type.
A fourth display screen 108 is arranged at a central position below the second display screen 106. In a typical use, the fourth screen 108 may be used for computer-generated presentations or other secondary conference information. Alternatively, the fourth screen 108 is replaced by several table mounted displays 110, as shown in FIG. 1B. Video cameras 112 are arranged on top of the upper display screens 106 in order to capture images of the local participants 102, which are transmitted to corresponding remote video conference sites.
A purpose of the setup shown in FIGS. 1A and 1B is to give the local participants 102 a feeling of actually being present in the same meeting-room as the remote participants that are shown on the respective display screens 106.
Key factors in achieving a feeling of presence are the ability to see at whom the remote participants are looking, that all the participants are displayed in real life size and that all displayed participants appear equally sized relative to each other. Another provision for achieving high quality telepresence is that the images of the remote participants are presented to each local participant as undistorted as possible.
In order to obtain this feeling of presence, a special set of rules, or a proprietary protocol, is used by telepresence systems. Therefore, a telepresence system, such as the ones shown in FIGS. 1A and 1B, will operate properly only with other telepresence systems supporting that set of rules (or protocol). This is further complicated by the fact that different telepresence systems can employ different numbers of display screens, e.g. one, two, three or four display screens. Finally, more than two telepresence systems can participate in a conference, and all the participants will still expect the same feeling of presence as with a two-system conference.
Further, since no standard protocol has been defined for telepresence systems, only telepresence systems from the same manufacturer tend to interoperate in a satisfactory way.
In many situations there is also a need for, e.g., a regular video conferencing terminal to call, or receive a call from, a telepresence system, even though the regular video conferencing terminal does not provide the same feeling of presence.
U.S. Pat. No. 7,034,860, which is incorporated herein by reference in its entirety, describes an apparatus and method for dynamically determining an image layout based on the number of participants or video sources connected to a conference. The system combines each video source into a composite video signal according to the defined composite image layout, and transmits this composite signal to the connected sites. This works well with single screen systems, but a problem arises when two multi-screen telepresence systems with different numbers of screens are connected, when more than two multi-screen telepresence systems are connected, and/or when single screen systems are connected to one or more multi-screen systems.
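By way of illustration only, the general idea of choosing a composite image layout from the number of connected video sources can be sketched as follows. The sketch is not taken from U.S. Pat. No. 7,034,860; the near-square grid rule and the function names choose_grid and layout_cells are assumptions introduced here for the example, and the sketch addresses only the single screen case.

```python
import math


def choose_grid(num_sources):
    """Return (rows, cols) of a near-square grid that holds all sources.

    Hypothetical rule for illustration: use the smallest number of
    columns whose square is at least num_sources, then as few rows as
    are needed to fit every source.
    """
    if num_sources < 1:
        raise ValueError("at least one video source is required")
    cols = math.isqrt(num_sources - 1) + 1   # integer ceil(sqrt(n))
    rows = -(-num_sources // cols)           # integer ceil(n / cols)
    return rows, cols


def layout_cells(num_sources, width, height):
    """Return one (x, y, w, h) rectangle per source within a single
    composite frame of the given pixel size, filled row by row."""
    rows, cols = choose_grid(num_sources)
    cell_w, cell_h = width // cols, height // rows
    return [
        ((i % cols) * cell_w, (i // cols) * cell_h, cell_w, cell_h)
        for i in range(num_sources)
    ]


if __name__ == "__main__":
    # Four sources composed into one 1920x1080 frame.
    for cell in layout_cells(4, 1920, 1080):
        print(cell)
```

A layout of this kind yields one composite image per receiving site, which is why it suits single screen systems; it does not by itself say how sources should be distributed when a receiving site has several screens.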
Thus, there is a need in the art to allow different types of telepresence endpoints (e.g. different manufacturers, different numbers of screens/cameras, etc.) to work well together in the same video conference.