In videotelephony, images and sound captured at a first location are combined and transmitted electronically to a recipient at a second location. In this way, people at the two different locations can communicate by both visual and audio means as though in the same room. Consequently, videotelephony is also sometimes referred to as telepresence.
Typically, a videotelephony system includes a video camera and one or more microphones. The camera captures video, which may then be digitized and transmitted along with sound captured by the microphones to a person or group of people elsewhere. Commonly, such systems are configured for two-way communication between multiple parties. Consequently, a typical videotelephony system will include a visual display and speakers to enable the receipt of video and sound from other locations concurrent with the transmission of video and sound from the local system.
Videotelephony systems generally transmit and receive video and sound over any of a variety of information networks, such as the Internet, local area networks (LANs), wireless networks, cellular networks, and standard telephone lines. With recent developments in Internet and broadband networking, videotelephony technology has become increasingly popular in a variety of applications.
Due to the dynamic nature of multi-user conferencing, the focus and attention of the attendees of any meeting may be periodically shifted from one participant to another as different participants desire to speak and/or display information. This can present specific challenges to a videotelephony system in the context of video conferencing.
For example, in many teleconferences, multiple participants may be present in one conferencing location with a single camera. Thus, the images of the participants that are captured and transmitted by the camera may make it difficult to determine which participant at that location is speaking, is displaying information, or should otherwise be the focus of attention for the participants at other locations.
In a videotelephony event, as in any meeting, the focus and attention may periodically shift from one participant or group of participants to another participant or group of participants. These shifts in focus and attention may occur when a new person starts speaking or when attention is to be directed to an object, display or other item that is being discussed. Such shifts will likely occur naturally several times throughout the course of a videotelephony meeting.
To follow the dynamics of the event, it may be desirable for a video camera at each of the conferencing locations to effectively capture video from a changing region of interest as the focus and attention of the meeting shifts among the participants or objects at that location. To assist those at other locations to follow the focal point of the discussion, it may be desirable to automatically frame a field of view of the video camera according to the dynamically changing region of interest.
For example, U.S. Pat. No. 5,268,734, entitled “Remote Tracking System for Moving Picture Cameras and Method,” describes a system in which a mobile remote unit that may be moved within the videoconference area is sensitive to infrared (IR) signals transmitted by a base or stationary unit. The stationary unit includes an IR transmitter placed behind a rotating lens, resulting in an IR signal being “scanned” across the videoconference area. The remote unit detects when the peak strength of the signal occurs and provides this data via a radio frequency (RF) signal back to the base unit. The effective angle between the axis of the IR signal and the remote unit, derived from this information, is used to create an error signal with which the base unit can position the video camera mounted thereon.
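The peak-detection scheme described above can be illustrated with a minimal sketch. It assumes that the IR beam sweeps at a constant angular rate from a known starting angle, so that the time of the strongest reading maps linearly to a bearing. The function names, the sweep parameters, and the simple angular-difference error are hypothetical and chosen for illustration only; they are not taken from the cited patent.

```python
def bearing_from_peak(samples, sweep_start_deg, sweep_rate_deg_per_s):
    """Estimate the bearing of the remote unit from one beam sweep.

    samples: list of (time_s, strength) pairs reported by the remote unit,
             with time measured from the start of the sweep (assumed).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    # The beam points at the remote unit when the received strength peaks.
    peak_time, _ = max(samples, key=lambda sample: sample[1])
    # Assumed constant sweep rate: angle grows linearly with time.
    return sweep_start_deg + sweep_rate_deg_per_s * peak_time


def wrap_deg(angle_deg):
    """Wrap an angle into the interval [-180, 180) degrees."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def pointing_error(bearing_deg, camera_deg):
    """Signed angle the camera would need to turn to face the remote unit."""
    return wrap_deg(bearing_deg - camera_deg)


# Hypothetical sweep: starts at -45 degrees and turns at 60 degrees per second.
samples = [(0.0, 0.1), (0.25, 0.4), (0.5, 0.9), (0.75, 0.5), (1.0, 0.2)]
bearing = bearing_from_peak(samples, sweep_start_deg=-45.0,
                            sweep_rate_deg_per_s=60.0)
error = pointing_error(bearing, camera_deg=0.0)
```

In this sketch the error is simply the wrapped difference between the estimated bearing and the current camera heading; a real positioning loop would typically filter that value before driving a motor.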
It should be noted, however, that this and similar prior art systems require the use of an entirely separate system, such as the infrared transmitter and RF receiver, to track the mobile remote unit and obtain the information for positioning the videotelephony camera. This adds complexity and expense to the videotelephony system.
Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.