1. Field of the Invention
The present invention relates in general to object detection and tracking, and in particular to a system and method for automatically adjusting gaze and head orientation for video conferencing.
2. Related Art
In face-to-face communication, gaze awareness, and eye contact in particular, are extremely important. Gaze is a signal for turn-taking in conversation. Also, it expresses attributes such as attentiveness, confidence, and cooperativeness. People using increased eye contact typically receive more attention and help from others, can generate more learning as teachers, and have better success with job interviews, etc.
These face-to-face communications are being increasingly replaced by teleconferencing, such as videoconferencing. As a result, videoconferencing has become popular in both business and personal environments. Unfortunately, eye contact and gaze awareness are usually lost in most videoconferencing systems. This is because the viewer cannot tell where the gaze of any other videoconferencing participant is directed in typical systems that use a camera that is located on top of a display device where the user interface appears. Namely, traditional videoconferencing applications present participants in separate windows of the user interface in order to provide spatial graphical representation of each participant on the display device and sacrifice gaze awareness.
For example, in these systems, if participant A desires to communicate with participant B during a videoconference, the gaze of participant A will be directed at the spatial representation of participant B (i.e., at the image of B on A's display device). Since the viewpoint of the camera is typically not in line with the spatial representation of the participants (normally the camera is placed near or on top of the display device and not in the display device), participant A will be looking at the display device instead of at the camera. Consequently, without gaze adjustments, as participant A is looking at the display device, and away from the camera, it is impossible for A to be perceived as looking directly out of B's display device and at B.
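The geometry behind this loss of eye contact can be illustrated with a short calculation. The following is a minimal sketch only, assuming a simplified model in which the camera sits a fixed distance above the on-screen image and the participant views the display from straight ahead. The function name `gaze_offset_degrees` and the numeric values in the usage lines are hypothetical and are not taken from any particular system.

```python
import math


def gaze_offset_degrees(camera_offset_m: float, viewing_distance_m: float) -> float:
    """Return the angle, in degrees, between looking at the on-screen image
    and looking into the camera.

    camera_offset_m:    distance from the on-screen image to the camera lens
                        (hypothetical value supplied by the caller).
    viewing_distance_m: distance from the participant's eyes to the display.
    """
    if viewing_distance_m <= 0:
        raise ValueError("viewing distance must be positive")
    return math.degrees(math.atan2(camera_offset_m, viewing_distance_m))


if __name__ == "__main__":
    # Illustrative values only: camera 0.15 m above the image.
    near = gaze_offset_degrees(0.15, 0.6)  # participant close to the display
    far = gaze_offset_degrees(0.15, 3.0)   # participant far from the display
    print(f"near: {near:.1f} degrees, far: {far:.1f} degrees")
```

Under this model the offset angle shrinks as the viewing distance grows, which is why the mismatch between camera and image is most noticeable when a participant sits close to the display.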
Therefore, because a videoconferencing participant looks at the images on their display device or monitor, and not directly into the camera, the participants never appear to make eye contact with each other. In addition, for multi-participant videoconferencing, video for each participant is in an individual window, which is usually placed arbitrarily on the screen. Consequently, gaze awareness also does not exist in these systems because each participant does not appear to look at the participant or participants that are being addressed during a conference.
Thus, in these videoconferencing environments, gaze awareness will also not exist because eye-contact is not present between the participants. Without gaze awareness, videoconferencing loses some of its communication value and can become uninteresting. This is because facial gaze, i.e., the orientation of a person's head, gives cues about a person's intent, emotion, and focus of attention. As such, gaze awareness can play an important role in videoconferencing.
To resolve this problem, several attempts have been made to create gaze awareness and spatialized teleconferences using specialized hardware. One system is the Hydra system, which uses a small display/camera pair for each participant, placed far enough from the user so that each participant's gaze at the display is virtually indistinguishable from gazing at the camera. Other systems have used half-silvered mirrors or transparent screens with projectors to allow the camera to be placed directly behind the display. However, these systems are expensive and hardware intensive.
Therefore, what is needed is a software system and method for automatically adjusting gaze and head pose in a videoconferencing environment. What is also needed is a videoconferencing system and method that restores gaze-awareness and eye-contact, and provides a sense of spatial relationship similar to face-to-face meetings with inexpensive software.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention is embodied in a system and method for automatically adjusting gaze and head pose in a videoconferencing environment, where each participant has a camera and display.
In general, the images of participants are digitally rendered with a software module in a virtual 3D space. Next, head-pose orientation and eye-gaze direction are digitally corrected. The digital rendering and correction are preferably performed as internal mathematical computations or software operations without the need for a display device. As such, when the digital rendering and correction are completed, the results are transmitted to a display screen so that a particular participant's image in the 3D space appears to other participants as if the particular participant were looking at the person they are looking at on the screen. For example, if a participant is looking at the viewer, their gaze is set toward the "camera", which gives the perception of eye contact.
Specifically, the software system includes a vision component and a synthesis component. The vision component is employed when the video is captured. The vision component detects the head pose relative to the display, the eye gaze relative to the display, and the outlines of the eyes. The synthesis component places the images of the participants in a virtual multi-dimensional space. The head pose can then be moved in the multi-dimensional space (swiveled), and the eye gaze can be set in any direction in the virtual multi-dimensional space. In addition, the eye gaze can be set to look directly at the "camera" (viewpoint) of the multi-dimensional space, creating an impression of eye contact with anyone viewing the display.
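By way of illustration only, the orientation step of the synthesis component can be sketched as a look-at computation in the virtual space. This sketch is not the claimed implementation. It assumes that the viewpoint ("camera") is at the origin, that participants are placed at positive z in front of it, and that a yaw of zero means facing straight toward the viewpoint. The names `look_at_angles`, `VIEWPOINT`, and the participant positions are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]

# Assumed position of the virtual "camera" (viewpoint) of the 3D space.
VIEWPOINT: Vec3 = (0.0, 0.0, 0.0)


def look_at_angles(source: Vec3, target: Vec3) -> Tuple[float, float]:
    """Return (yaw, pitch) in degrees that turn a head located at `source`
    toward `target`.

    Yaw is measured about the vertical (y) axis, with 0 meaning the head
    faces the viewpoint plane along -z and positive values turning toward +x.
    Pitch is the elevation above the horizontal plane.
    """
    dx = target[0] - source[0]
    dy = target[1] - source[1]
    dz = target[2] - source[2]
    if dx == 0 and dy == 0 and dz == 0:
        raise ValueError("source and target must differ")
    yaw = math.degrees(math.atan2(dx, -dz))
    pitch = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    return yaw, pitch


if __name__ == "__main__":
    participant_a: Vec3 = (-1.0, 0.0, 2.0)  # hypothetical placement
    participant_b: Vec3 = (1.0, 0.0, 2.0)   # hypothetical placement

    # A addresses B: swivel A's head toward B's position in the space.
    print(look_at_angles(participant_a, participant_b))
    # A addresses the viewer: set A's gaze toward the viewpoint.
    print(look_at_angles(participant_a, VIEWPOINT))
```

In such a sketch, the head pose and eye gaze detected by the vision component would determine which target a participant is addressing, and the resulting angles would drive the swivel and gaze correction applied when the participant's image is rendered.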
The present invention as well as a more complete understanding thereof will be made apparent from a study of the following detailed description of the invention in connection with the accompanying drawings and appended claims.