Typical prior art videoconferencing systems fall into one of two categories: those where the intelligence is centralized in the coder-decoder (codec) or a system control unit; and those where the intelligence is distributed so that each peripheral device controller has the intelligence necessary to directly control other peripheral devices in the system. One shortcoming of centralized intelligence systems is that such systems are not readily adaptable to accommodate new devices and new versions of existing devices. The addition of another peripheral device beyond the number originally planned for, or the addition of a new type of peripheral device, can require a substantial investment in time and money to accommodate the desired additional device or new device. Furthermore, most centralized intelligence systems have a limited capacity with respect to the number of ports available to connect to peripheral devices. Once this capacity has been reached, new devices can be added only by removing existing devices, such as lesser used devices, or by obtaining another codec or system controller which can accommodate the increased number of devices.
Distributed intelligence systems, such as that shown in U.S. Pat. No. 5,218,627 to Corey, have the shortcoming that each peripheral device controller must have the intelligence necessary to control every type of peripheral device connected to the network, and every additional peripheral device must have a peripheral device controller which has the intelligence necessary to control all the existing devices on the network. Therefore, the addition of a new type of peripheral device requires new programming to be provided for each of the existing peripheral device controllers, and requires programming of the controller for the new type of device to accommodate the existing peripheral devices.
Therefore, there is a need for a videoconferencing system which can readily accommodate both additional peripheral devices and new types of peripheral devices.
Positioning of video cameras is required for videoconferencing as well as for a number of other activities, such as surveillance. The terms pan, tilt, zoom and focus are industry standards which define the four major axes for which a camera may be adjusted. Traditional camera positioning provides for manual adjustment of these axes, as well as buttons which provide for automatically positioning the camera to a preset location. A preset function recalls the pan, tilt, zoom and focus settings that have been previously ascertained and stored for that preset location.
Traditional videoconferencing systems provide for rather rudimentary control of these camera functions. That is, the user has a control panel for manually controlling camera functions, such as buttons for up/down, left/right, zoom in/out, and focus. The user can also typically select one of several preset camera settings so that, by the press of a single button, the camera will automatically position and focus itself at some preselected target. Of course, the preset function requires planning because the camera must be manually adjusted for the preset, and then the settings stored. The preset button then merely recalls these settings and adjusts the camera accordingly. If a location has not been preset then the user must manually adjust the pan, tilt, zoom, and focus settings for that location.
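The store-then-recall behavior of a preset button described above can be illustrated with a minimal sketch. The class names, the button numbering, and the units of the four settings are assumptions made only for illustration; they are not taken from any particular prior art system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CameraSettings:
    """One camera position on the four axes (units are hypothetical)."""
    pan: float
    tilt: float
    zoom: float
    focus: float


class PresetBank:
    """Hypothetical table mapping a preset button to stored settings."""

    def __init__(self) -> None:
        self._presets: Dict[int, CameraSettings] = {}

    def store(self, button: int, settings: CameraSettings) -> None:
        # The camera has already been adjusted manually; only the
        # resulting settings are saved under the chosen button.
        self._presets[button] = settings

    def recall(self, button: int) -> Optional[CameraSettings]:
        # Pressing a preset button merely looks up what was stored.
        # None means the location was never preset, so the user must
        # adjust pan, tilt, zoom, and focus manually.
        return self._presets.get(button)


bank = PresetBank()
bank.store(1, CameraSettings(pan=-15.0, tilt=5.0, zoom=2.0, focus=3.5))
```

In this sketch, recalling button 1 returns the stored settings, while recalling a button that was never stored returns nothing, which corresponds to the case where the user must position the camera manually.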
However, these controls are not intuitively obvious or easy to use, partly because the user may think that the camera should pan in one direction to center an object whereas, because of the position of the camera with respect to the user and the object, which object may be the user, the camera should actually move in the opposite direction. For example, the user typically sits at a table and faces the camera, and beside the camera is a monitor screen which allows the user to see the picture that the camera is capturing. If the user is centered in the picture, and wishes the camera to center on his right shoulder, the user may think that he wants the camera to pan left because, on the screen as seen by the user, the user's right shoulder is to the left of the user's center. However, the camera should actually pan to the right because, from the camera's viewpoint, the user's right shoulder is to the right of the user's center.
Also, current manual camera positioning techniques typically use a fixed motor speed. This results in the panning being too rapid and the scene flying by when the camera is zoomed in on an object, or in the panning being too slow and the scene taking a prolonged time to change to the desired location when the camera is in a wide field of view setting (zoomed out).
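The effect of a fixed motor speed can be illustrated with simple arithmetic: the time an object takes to cross the picture is roughly the field of view divided by the pan speed. The following sketch assumes a hypothetical fixed pan speed of 20 degrees per second and hypothetical fields of view of 5 degrees (zoomed in) and 60 degrees (zoomed out); none of these figures comes from a specific system.

```python
def seconds_to_cross_view(field_of_view_deg: float,
                          pan_speed_deg_per_s: float) -> float:
    """Approximate time for a point to sweep across the whole picture."""
    if pan_speed_deg_per_s <= 0:
        raise ValueError("pan speed must be positive")
    return field_of_view_deg / pan_speed_deg_per_s


FIXED_PAN_SPEED = 20.0  # degrees per second (assumed value)

# Zoomed in: a narrow field of view sweeps past almost instantly.
zoomed_in = seconds_to_cross_view(5.0, FIXED_PAN_SPEED)

# Zoomed out: the same speed takes far longer to change the scene.
zoomed_out = seconds_to_cross_view(60.0, FIXED_PAN_SPEED)
```

Under these assumed figures the scene crosses the picture in a quarter of a second when zoomed in and in three seconds when zoomed out, so no single fixed speed suits both settings.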
Furthermore, in traditional videoconferencing systems, when the camera is moving to a preset location the pan and tilt systems move at the same rate. If the required pan movement is different from the required tilt movement then the camera will have completed its movement along one axis before it has completed its movement along the other axis. This makes the camera movement appear to be jerky and unnatural.
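The uneven movement described above can be made concrete with a short sketch. It assumes a hypothetical common rate of 20 degrees per second for both axes, and a hypothetical move of 40 degrees in pan and 10 degrees in tilt.

```python
import math


def finish_time(start: float, target: float, rate: float) -> float:
    """Seconds one axis needs to reach its target at a fixed rate."""
    return abs(target - start) / rate


def position_at(t: float, start: float, target: float, rate: float) -> float:
    """Position of one axis at time t, stopping once the target is reached."""
    distance = target - start
    travelled = min(abs(distance), rate * t)
    return start + math.copysign(travelled, distance)


RATE = 20.0  # degrees per second, shared by pan and tilt (assumed value)

pan_done = finish_time(0.0, 40.0, RATE)   # pan needs the full move time
tilt_done = finish_time(0.0, 10.0, RATE)  # tilt stops much earlier

# Halfway through the pan, the tilt axis has long been stationary.
pan_mid = position_at(1.0, 0.0, 40.0, RATE)
tilt_mid = position_at(1.0, 0.0, 10.0, RATE)
```

With these assumed values the camera first travels diagonally, then abruptly changes to a purely horizontal motion once the tilt axis stops, which is the jerky appearance noted above.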
After the user has completed the process of changing the camera position the user may have to refocus the camera. As chance would have it, the first attempt to refocus the camera usually is in the wrong direction. That is, the user inadvertently defocuses the camera. The learning process is short, but the need to focus creates delays and frustration.
When the system has multiple cameras which are subject to control by the user, typical systems require the user to use buttons on the control keyboard to manually select the camera to be controlled, and/or assign separate keys to separate cameras. Frequently, the user will select the wrong camera, or adjust the wrong camera.