A videoconference “call” comprises a connection of two or more videoconferencing endpoints through a network for a period of time.
Generally, this comprises a logical connection, typically through a packet-based network using Internet Protocol (IP). A multi-point videoconference comprises a call with more than two endpoints. An endpoint is a videoconferencing location (e.g., a room) that comprises resources such as cameras, displays and codecs that allow a videoconferencing system to collect and display video and audio, and to send and receive video and audio streams across a network. Generally, a codec is a device configured to compress and decompress video and to send and receive compressed video streams over a network. A multi-point control unit (MCU) is an intermediary network device that handles routing in a multi-point call.
Correct eye-gaze can be a challenge in multi-screen, multi-camera and multi-point video as used in videoconferencing systems. To acquire the correct viewpoints, the video cameras should be carefully placed. When a participant in a conference can see several others on a large display surface, those others may be confused as to whom the participant is looking at. The position of the camera determines the viewpoint. Ideally, each participant would see a unique view of the other participants, from his or her particular perspective.
Endpoint configuration comprises arrangement of the chairs, tables, cameras, screens, network, and video processing components at a videoconferencing endpoint. In some systems, the physical positions that the conference participants are permitted to occupy are fixed and specified. Other parts of an endpoint configuration can change from call to call. For different call situations, some of the resources can be reconfigured. For instance, if a local videoconference facility has unused positions, it is possible that participants in remote locations may see an empty chair or unoccupied position. While other solutions have been suggested for determining participant presence and location in a videoconference, such as motion detection, chair sensors, or presence monitoring with RFID or ID badges, each of these has drawbacks. For example, motion detection may not be able to accurately determine the number of participants or to distinguish participants who are seated close together. Chair sensors may require specialized equipment and decrease the ability to move chairs or reconfigure seating positions. RFID or ID badges require external infrastructure and encumber the participants who must carry them.
Call configuration comprises the routing of video streams from endpoint to endpoint. In a simple, two-point videoconference, video is streamed from point A to point B, and from point B to point A, but in other instances video is streamed to and from more than two endpoints. It is to be appreciated that there can be a plurality of video streams to and from each endpoint depending upon the number of cameras, views, displays and participants at each endpoint.
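By way of illustration only, the growth in the number of video streams with the number of endpoints and cameras can be sketched as follows. The sketch assumes a hypothetical full-mesh call configuration in which every camera view at an endpoint is streamed to every other endpoint; the function name `enumerate_streams` and the endpoint labels are illustrative and do not describe any particular system.

```python
from itertools import permutations


def enumerate_streams(cameras_per_endpoint):
    """List every directed stream as (source, camera_index, destination).

    Assumption: each camera view at a source endpoint is sent to every
    other endpoint in the call (a full mesh, with no stream selection).
    """
    streams = []
    # permutations(..., 2) yields every ordered pair of distinct endpoints.
    for src, dst in permutations(cameras_per_endpoint, 2):
        for cam in range(cameras_per_endpoint[src]):
            streams.append((src, cam, dst))
    return streams


# Simple two-point call: one camera at each of A and B.
two_point = enumerate_streams({"A": 1, "B": 1})

# Multi-point call: three endpoints with three cameras each.
three_point = enumerate_streams({"A": 3, "B": 3, "C": 3})

print(len(two_point), len(three_point))
```

Under these assumptions the two-point call carries one stream in each direction, while the three-endpoint call with three cameras per endpoint already carries eighteen streams.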
Because of network bandwidth limitations, it may not be possible to send and receive all views of all participants of each endpoint involved in a multipoint videoconference. Further, endpoint configuration parameters such as camera position, participant location, and camera selection may need to be reconfigured to better facilitate the videoconference. Therefore, it is desirable to have a multipoint conference system which would automatically choose or suggest placement of cameras and selection of video streams to improve participant gaze.
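As a rough illustration of the bandwidth limitation, the following sketch compares the inbound bandwidth an endpoint would need to receive every remote camera view against the capacity of its network link. All figures (four endpoints, three cameras each, 4 Mbps per stream, a 20 Mbps link) and the names `inbound_mbps` and `max_receivable_streams` are assumptions chosen for the example, not values taken from any particular system.

```python
def inbound_mbps(cameras_per_endpoint, receiver, mbps_per_stream):
    """Bandwidth needed for `receiver` to receive every remote camera view."""
    remote_views = sum(
        count for endpoint, count in cameras_per_endpoint.items()
        if endpoint != receiver
    )
    return remote_views * mbps_per_stream


def max_receivable_streams(link_mbps, mbps_per_stream):
    """Largest number of whole streams that fit within the link capacity."""
    return link_mbps // mbps_per_stream


# Hypothetical call: four endpoints, three cameras at each (assumed values).
call = {"A": 3, "B": 3, "C": 3, "D": 3}
MBPS_PER_STREAM = 4   # assumed bitrate of one compressed video stream
LINK_MBPS = 20        # assumed inbound link capacity at endpoint A

needed = inbound_mbps(call, "A", MBPS_PER_STREAM)
limit = max_receivable_streams(LINK_MBPS, MBPS_PER_STREAM)
all_views_fit = needed <= LINK_MBPS

print(needed, limit, all_views_fit)
```

With these assumed figures, endpoint A would need 36 Mbps to receive all nine remote views but can receive only five streams, so some selection among the available views is required.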