Conventional videoconferencing systems comprise a number of end-points communicating real-time video, audio and/or data (often referred to as duo video) streams over and between various networks such as WAN, LAN and circuit switched networks.
In most high-end video conferencing systems, high-quality cameras with pan, tilt and zoom capabilities are used to capture a view of the meeting room and the participants in the conference. The cameras typically have a wide field-of-view (FOV) and a large mechanical pan, tilt and zoom range. This allows for both a good overview of the meeting room and the possibility of capturing close-up images of participants and objects. The video stream from the camera is compressed and sent to one or more receiving sites in the video conference.
All sites in the conference receive live video and audio from the other sites in the conference, thus enabling real time communication with both visual and acoustic information.
During a video conference, participants at a local site often wish to share certain visual details of physical objects with the remote site. A typical example is the designer of a product (e.g. a shoe) who wants to discuss manufacturing problems with a manufacturer located on another continent. To show details of the manufacturing defects or challenges, the manufacturer can zoom in on the product (the shoe) and point at specific points or areas on it while discussing with the designer how to solve the problem. In other situations, participants may want to share information that is only available on paper, such as images, diagrams, drawings or even text. Today's high-quality video conference cameras are certainly capable of providing close-up images of such objects. However, to show such details, the local user must manually adjust the camera's pan, tilt and zoom to capture the desired view.
Adjustments to the camera are typically made by manually controlling the camera's pan, tilt and zoom with a standard input device, such as a mouse or the keypad on a remote control. Typically, a traditional IR remote control with standard push-buttons is used. A standard setup is a set of four arrow keys to control the pan and tilt, and a zoom-in and a zoom-out button to control the zoom.
Manually adjusting the camera's pan, tilt and zoom to capture small details, as described above, is a tedious and time-consuming process. First, a user must activate camera control by navigating through several on-screen menus provided by the video conference system. Second, once camera control is activated, the user must manually adjust the camera using the arrow keys on the remote control. This is often an iterative process of alternately adjusting the zoom and the pan/tilt.
Further, even though the camera's pan-tilt mechanism includes small step motors (allowing “high resolution” movement), the video conferencing system is often configured to move the camera in larger fixed steps per key press to spare the user from excessive key pushing. This works as intended when the camera is in a wide FOV. However, it may cause trouble when the camera is zoomed in, since each step then becomes quite large relative to the narrower view.
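The effect of a fixed step at different zoom levels can be illustrated with a short calculation. The following is only a sketch: the step size and FOV values are hypothetical numbers chosen for illustration, the function name is invented here, and the ratio of step angle to FOV is a small-angle approximation of how far the image shifts per key press.

```python
def step_as_fraction_of_view(step_deg: float, fov_deg: float) -> float:
    """Approximate image shift per key press as a fraction of the visible width.

    Uses the small-angle approximation shift ~= step angle / field of view.
    """
    if fov_deg <= 0:
        raise ValueError("fov_deg must be positive")
    return step_deg / fov_deg


# Hypothetical values, not taken from any particular camera.
STEP_DEG = 2.0         # fixed pan step per key press
WIDE_FOV_DEG = 70.0    # horizontal FOV when fully zoomed out
ZOOMED_FOV_DEG = 7.0   # horizontal FOV when zoomed in on a small object

for label, fov in (("wide", WIDE_FOV_DEG), ("zoomed in", ZOOMED_FOV_DEG)):
    fraction = step_as_fraction_of_view(STEP_DEG, fov)
    print(f"{label}: one key press shifts the view by about {fraction:.0%} of its width")
```

Under these assumed numbers, the same key press that nudges a wide shot only slightly moves a zoomed-in shot by a large share of the frame, which is why the desired detail is easily overshot.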
Therefore, finding the optimal camera adjustment in known systems often requires several iterations of pushing buttons on a remote control and/or navigating an on-screen menu system, which makes the process cumbersome, distracting and time-consuming.