1. Field of the Invention
The present invention relates generally to videoconferencing and more particularly to a method of improving the visual effect of moving a videoconferencing camera from a first position (e.g., trained on a first videoconference participant) to a second position (e.g., trained on a second videoconference participant).
2. Description of Related Art
Videoconferencing systems have become relatively widespread in commercial and other applications. One reason for the proliferation of videoconferencing is that it provides a level of interactivity comparable to a face-to-face meeting without the attendant expense of travel for conference participants who are not in the same physical location. The effectiveness of videoconferencing depends in part on providing an experience to the user that is comparable to a face-to-face meeting. One way in which this is accomplished is by controlling the view of the videoconferencing camera to present an image to the remote users that corresponds to where they would focus their attention if they were in a face-to-face meeting. For example, when one person is speaking, the camera may zoom in to a head-and-shoulders view of the speaker. Alternatively, when someone at the local end of a conference is speaking, the camera at the remote end may zoom out so that the speaker can see all of the remote conference participants. All of this is accomplished through pan, tilt, and zoom control of the videoconferencing camera.
This camera motion may be controlled manually. In such an application an operator, who may be one of the conference participants, uses some form of interface to manually adjust the pan, tilt, and zoom of either a local or remote camera. This typically takes the form of an arrow keypad or joystick type control. Often these controls (i.e., the camera controller) interface with the videoconferencing system, and may take the form of hardware, software or some combination thereof. However, in many cases camera motion is controlled automatically or semi-automatically. For purposes of the following description, automatic camera control will be used to refer to both types. In semi-automatic camera control, a variety of views are defined in memory as “preset” locations.
If it is known that two people at one end of a videoconference will be speaking during the course of the conference, a close-up view of each of them is defined as well as a broader, zoomed-out view of the entire endpoint of the conference. A videoconference operator can then select among these camera positions by the simple push of a button, which will interface with the camera controller to point the camera to one of the preset viewpoints. These preset views may either be set up before the videoconference or they may be set manually during the course of the videoconference and saved as preset positions so that the operator can return to a given position at future times by pushbutton rather than manual adjustment. Again, the camera controller will be some combination of hardware and software that either interfaces with the videoconferencing system or is part of the videoconferencing system.
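The preset mechanism described above can be illustrated with a minimal sketch. It is not taken from any particular videoconferencing system: the names CameraPosition and PresetStore, the button numbers, and the choice of units (degrees for pan and tilt, a magnification factor for zoom) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraPosition:
    """A stored view. Units are assumed: degrees and a magnification factor."""
    pan_deg: float
    tilt_deg: float
    zoom: float  # 1.0 = fully zoomed out (hypothetical scale)


class PresetStore:
    """Hypothetical in-memory table mapping a button number to a saved view."""

    def __init__(self):
        self._presets = {}

    def save(self, button, position):
        # Save (or overwrite) the view associated with this button.
        self._presets[button] = position

    def recall(self, button):
        # Return the saved view; raises KeyError if the button is unassigned.
        return self._presets[button]


# Two close-up views of the expected speakers and one wide view of the room.
store = PresetStore()
store.save(1, CameraPosition(pan_deg=-20.0, tilt_deg=0.0, zoom=4.0))
store.save(2, CameraPosition(pan_deg=25.0, tilt_deg=0.0, zoom=4.0))
store.save(3, CameraPosition(pan_deg=0.0, tilt_deg=0.0, zoom=1.0))

# Pressing button 2 yields the target that the camera controller would move to.
target = store.recall(2)
```

In this sketch, saving a preset during the conference is the same save call made with the camera's current position.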
Alternatively, for automatic camera control, the videoconferencing system typically includes a camera controller that is capable of processing the audio signal picked up by microphones and the video signal generated by the camera to determine the location of the speaker. Thus, the videoconferencing system can determine the precise location of the speaker's face and can automatically “zoom in” on this location. Exemplary systems for accomplishing this are disclosed in U.S. Pat. No. 5,778,082 entitled “Method And Apparatus For Localization Of An Acoustic Source”; U.S. Pat. No. 6,593,956 entitled “Locating An Audio Source”; and co-pending U.S. patent application Ser. No. 10/004,070 entitled “Automatic Camera Tracking using Beamforming,” which are hereby incorporated by reference in their entirety.
In prior art videoconference systems incorporating automatic or semi-automatic camera control, the camera typically moves from one preset position to another as follows: the camera is panned (and/or tilted) all the way to the next pan (and/or tilt) position and then zoomed to the next zoom position. Alternatively, the camera may be zoomed to the next zoom level and then panned (and/or tilted) to the next pan (and/or tilt) position. In either case, this type of camera motion is aesthetically desirable only when the positions of the two presets are relatively close together. In many cases, two preset positions are not sufficiently close, making such camera motion undesirable. For example, if the camera is currently zoomed in all the way to show a close-up of a conference participant, then when moving to the next preset the camera pans through the positions between the first and second presets at full zoom. Often, this results in excessively large images of conference participants and/or other objects sweeping across the frame, which is a less than pleasing aesthetic effect.
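The pan-then-zoom motion described above can be sketched as a list of intermediate camera positions. This is an illustrative model only: the function name pan_then_zoom, the (pan, tilt, zoom) tuple layout, the step count, and the linear interpolation are assumptions, not details of any prior art system.

```python
def pan_then_zoom(start, end, steps=4):
    """Return waypoints for a move that first pans/tilts, then zooms.

    start and end are (pan_deg, tilt_deg, zoom) tuples (assumed layout).
    """
    p0, t0, z0 = start
    p1, t1, z1 = end
    path = []
    # Phase 1: pan and tilt to the target while holding the starting zoom.
    for i in range(1, steps + 1):
        f = i / steps
        path.append((p0 + (p1 - p0) * f, t0 + (t1 - t0) * f, z0))
    # Phase 2: zoom to the target level at the final pan/tilt position.
    for i in range(1, steps + 1):
        f = i / steps
        path.append((p1, t1, z0 + (z1 - z0) * f))
    return path


# Moving between two close-ups: every waypoint of the pan is at 4x zoom,
# so whatever lies between the two participants fills the frame.
for pan, tilt, zoom in pan_then_zoom((-20.0, 0.0, 4.0), (25.0, 0.0, 4.0)):
    print(f"pan={pan:7.2f}  tilt={tilt:5.2f}  zoom={zoom:4.2f}")
```

Under these assumptions the zoom value never drops below the close-up level during the pan, which is the source of the effect described in the preceding paragraph.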
Professional video productions avoid this undesirable effect by zooming out from the first position, panning from the zoomed-out position to the new position, and then zooming in to the second position. While this technique does alleviate the undesirable aesthetic effect of the camera motion described in the preceding paragraph, it requires the intervention of a professional camera operator, who is unavailable for most videoconferences and whose involvement adds significant expense even when available.
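The three-phase motion used in professional productions can be sketched in the same terms. This is a simplified illustration of the zoom-out, pan, zoom-in sequence described above, not the method disclosed herein; the function name, the wide_zoom parameter, the (pan, tilt, zoom) tuple layout, and the linear interpolation are assumptions.

```python
def zoom_out_pan_zoom_in(start, end, wide_zoom=1.0, steps=4):
    """Return waypoints for a zoom-out, pan/tilt, zoom-in move.

    start and end are (pan_deg, tilt_deg, zoom) tuples (assumed layout);
    wide_zoom is the hypothetical zoom level held while the camera pans.
    """
    p0, t0, z0 = start
    p1, t1, z1 = end
    path = []
    # Phase 1: zoom out at the starting pan/tilt position.
    for i in range(1, steps + 1):
        f = i / steps
        path.append((p0, t0, z0 + (wide_zoom - z0) * f))
    # Phase 2: pan and tilt to the target while holding the wide view.
    for i in range(1, steps + 1):
        f = i / steps
        path.append((p0 + (p1 - p0) * f, t0 + (t1 - t0) * f, wide_zoom))
    # Phase 3: zoom in to the target zoom level.
    for i in range(1, steps + 1):
        f = i / steps
        path.append((p1, t1, wide_zoom + (z1 - wide_zoom) * f))
    return path


for pan, tilt, zoom in zoom_out_pan_zoom_in((-20.0, 0.0, 4.0), (25.0, 0.0, 4.0)):
    print(f"pan={pan:7.2f}  tilt={tilt:5.2f}  zoom={zoom:4.2f}")
```

In this sketch the camera only travels between the two positions at the wide zoom level, so intervening objects appear small in the frame during the pan.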
Therefore, what is needed in the art is a technique that accomplishes more desirable transitions between preset camera positions in a videoconference without requiring the intervention of a professional camera operator. Disclosed herein is such a technique.