1. Field of the Invention
The invention relates generally to videoconferencing systems and more specifically to camera tracking of videoconference participants.
2. Description of Related Art
When people in remote locations wish to communicate, a videoconferencing system is a convenient way to share ideas. Typically, more than one person participates in a videoconference, so a camera must be positioned to frame everyone in the room. Such wide shots are impersonal, however, and lack the detail a viewer needs to pick up on the nuances of a speaker's facial expressions. Manually steering the camera from one speaker to the next is inconvenient and distracting. A number of prior art patents in this field are directed to solving this problem.
For example, U.S. Pat. No. 5,686,957 to Baker teaches automatic tracking by using a plurality of microphones and a crude audio detection circuit. Because the audio detection mechanism is limited in its ability to locate sound, a special camera is used to enhance the peripheral portion of the field of view. The camera rotates in increments of approximately 30 degrees in order to frame the general location of the speaker.
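The coarse stepping described above can be pictured with a minimal sketch. It is not taken from the Baker patent; the function name `snap_to_increment` and the convention of wrapping bearings to the range 0 to 360 degrees are assumptions made only for illustration.

```python
import math


def snap_to_increment(bearing_deg: float, step_deg: float = 30.0) -> float:
    """Snap a bearing to the nearest multiple of step_deg, wrapped to [0, 360).

    Hypothetical helper: the 30-degree default mirrors the increment
    mentioned in the text; everything else is an illustrative assumption.
    """
    # floor(x + 0.5) rounds half-way cases upward, avoiding Python's
    # round-half-to-even behavior.
    steps = math.floor(bearing_deg / step_deg + 0.5)
    return (steps * step_deg) % 360.0


if __name__ == "__main__":
    # A speaker estimated at 47 degrees is framed by the 60-degree stop.
    print(snap_to_increment(47.0))
```

With such coarse stops the camera can frame only the general region of the speaker, which is consistent with the limitation noted above.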
U.S. Pat. No. 5,778,082 to Chu teaches automatic tracking by using a plurality of microphones and a processor. The processor determines when an acoustic signal begins at each microphone, and then locates the direction of the source based on a comparison of data from each microphone. The system, however, requires processor-intensive Fast Fourier Transform calculations to be made continuously.
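As a rough illustration of how a comparison of onset times can yield a direction, the following sketch applies the standard far-field time-difference-of-arrival relation to a single pair of microphones. It is not the procedure of the Chu patent. The function name, the two-microphone geometry, the sign convention, and the speed-of-sound constant are all assumptions.

```python
import math

# Assumed value: approximate speed of sound in dry air near 20 degrees C.
SPEED_OF_SOUND_M_S = 343.0


def bearing_from_onsets(t_left_s: float, t_right_s: float,
                        mic_spacing_m: float) -> float:
    """Estimate a source bearing in degrees from onset times at two microphones.

    Far-field assumption: the wavefront is treated as planar, so
    sin(bearing) = c * delay / spacing. Zero degrees is broadside
    (straight ahead); positive bearings lie toward the right microphone.
    """
    # Positive delay means the sound reached the right microphone first.
    delay_s = t_left_s - t_right_s
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Clamp so measurement noise cannot push asin outside its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Hypothetical onset times for microphones spaced 0.2 m apart.
    print(bearing_from_onsets(0.0102, 0.0100, 0.2))
```

This sketch works directly on onset times and involves no transform; it is intended only to show the geometric relation, not the computational cost discussed above.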
U.S. Pat. No. 5,940,118 to Van Schyndel teaches automatic tracking using an optical transducer that takes visual cues (e.g., a moving mouth) to point the camera toward the location of the speaker. The method requires a very advanced processor and an optical transducer, and it is subject to many false signals (e.g., one participant whispering to a neighbor).
U.S. Pat. No. 5,959,667 to Maeng teaches automatic tracking using a microphone array and a set of preset values. The preset values include a set of camera parameters that define a particular camera shot. The microphone array locates the position of the speaker, and the position is compared to the preset values. The camera then moves to the closest preset position. The method also requires a very powerful processor because the speaker location must be calculated continuously.
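The comparison of a located position against stored presets can be sketched as a nearest-neighbor lookup. The sketch below is not drawn from the Maeng patent. The `Preset` fields, the use of pan angle alone as the distance measure, and the function name `closest_preset` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Preset:
    """Hypothetical camera shot: a name plus pan, tilt, and zoom parameters."""
    name: str
    pan_deg: float
    tilt_deg: float
    zoom: float


def closest_preset(speaker_pan_deg: float, presets: Sequence[Preset]) -> Preset:
    """Return the preset whose pan angle is nearest the speaker's bearing."""
    def angular_gap(preset: Preset) -> float:
        # Measure the gap the short way around the circle.
        diff = abs(preset.pan_deg - speaker_pan_deg) % 360.0
        return min(diff, 360.0 - diff)

    return min(presets, key=angular_gap)


if __name__ == "__main__":
    shots = [
        Preset("left", -40.0, 0.0, 2.0),
        Preset("center", 0.0, 0.0, 1.0),
        Preset("right", 35.0, 0.0, 2.0),
    ]
    # A speaker located at 20 degrees is 15 degrees from "right"
    # and 20 degrees from "center", so "right" is selected.
    print(closest_preset(20.0, shots).name)
```

The lookup itself is inexpensive; the processing burden noted above comes from having to compute the speaker location continuously.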
U.S. Pat. No. 6,005,610 to Pingali teaches a method for localizing a speaker based on concurrent analysis of a video image and plural microphone signals. The position of the speaker is tracked as the speaker changes position.
U.S. Pat. No. 5,438,357 to McNelley teaches a method for image manipulation that creates the impression of eye contact between videoconferencing parties. The camera is manipulated to track a speaker such that the speaker's image is centered on a display.
U.S. Pat. No. 5,528,289 to Cortjens et al. teaches a system for controlling devices on a computer network. The pan, tilt, and zoom of each camera are controlled via a pointing device such as a joystick or mouse.
U.S. Pat. No. 5,581,620 to Brandstein et al. teaches a method for enhancing the reception of signals received at several locations using beamforming techniques. The location of a speaker is calculated and the information is used to align the phase of signals from multiple audio detectors. The signals originating from the speaker's location are added while signals from other locations are attenuated.
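The phase alignment and summation described above correspond in general terms to delay-and-sum beamforming. The following is a minimal sketch of that generic technique, not of the Brandstein et al. method itself. Whole-sample delays, plain Python lists as signals, and the function name `delay_and_sum` are assumptions.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays_samples: Sequence[int]) -> List[float]:
    """Align each channel by its arrival delay (in samples) and average.

    delays_samples[i] is the assumed extra delay, relative to the earliest
    microphone, with which the target source reaches microphone i.
    """
    # Only the span covered by every shifted channel can be summed.
    usable = min(len(ch) - d for ch, d in zip(channels, delays_samples))
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays_samples)) / len(channels)
        for n in range(usable)
    ]


if __name__ == "__main__":
    # Hypothetical data: the same pulse reaches microphone B two samples
    # after microphone A.
    mic_a = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    mic_b = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
    # Steered at the source: the pulses line up and add coherently.
    print(delay_and_sum([mic_a, mic_b], [0, 2]))
    # Steered elsewhere: the pulses stay apart and each is halved.
    print(delay_and_sum([mic_a, mic_b], [0, 0]))
```

When the delays match the speaker's location the pulse retains full amplitude, and when they do not it is attenuated, which is the effect described above.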
U.S. Pat. No. 5,583,565 to Cortjens et al. teaches a camera system that can be remotely controlled over a computer network for videoconferencing. Preset operational parameters are stored within the camera to increase the ease of operation. A user input device is used to control the camera.
U.S. Pat. No. 5,844,599 to Hildin teaches a video system in which a camera is directed at a voice-activated emitter associated with a speaker. Each emitter is detected by means of an infrared position signal, and the video camera is directed at that emitter's location.