In most video conferencing systems, high-quality cameras with pan, tilt, and zoom capabilities are used to frame a view of the meeting room and the participants in the conference. The video stream from the camera is compressed and sent to one or more receiving sites in the video conference. All sites in the conference receive live video and audio from the other sites, thus enabling real-time communication with both visual and acoustic information.
Adjustments to the camera may be made both before and during the video conference to display an optimal view of a site, typically to show some or all of the participants present at that site. These adjustments may be made manually via a remote control, either by controlling the camera pan, tilt, and zoom, or by choosing among a set of predefined camera positions. Other approaches adjust the camera automatically, relying on image and/or audio analysis. However, these conventional systems require either repetitive inputs from a user or complex image and audio analysis. None of them describes a simplified system enabling a user to choose which customized view of a video conferencing site to send to the other sites.