Human perception of three-dimensional (3D) vision relies upon various clues in the images obtained by the eyes (more precisely, the retinas) of the human viewer. The terms “stereopsis” (or binocular parallax) and “motion parallax” refer to two categories of these clues. More particularly, binocular parallax refers to those clues arising from the different images of the same scene obtained by each of the eyes of the human viewer. For instance, because the eyes are positioned a certain distance apart, the right eye obtains an image as seen from a position to the right of the left eye, and vice versa. Visual regions in the human brain interpret the difference between these two images and derive a perception of three-dimensional vision therefrom.
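By way of illustration only, the dependence of binocular parallax on distance can be sketched with a simple pinhole model of each eye. The function name, the eye separation of 0.063 m, and the focal length of 0.017 m are assumptions chosen for the example and are not taken from any particular system.

```python
def binocular_disparity(depth_m, eye_separation_m=0.063, focal_length_m=0.017):
    """Approximate offset, on the image plane, between the left-eye and
    right-eye projections of a point located straight ahead at depth_m.

    Assumes an idealized pinhole model; the default eye separation and
    focal length are illustrative values only.
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    # Similar triangles: the offset shrinks in proportion to 1 / depth.
    return focal_length_m * eye_separation_m / depth_m


near_offset = binocular_disparity(1.0)  # object 1 m from the viewer
far_offset = binocular_disparity(4.0)   # object 4 m from the viewer
```

Under these assumptions, the offset for the nearer object is four times that for the farther object, which is the kind of difference between the two images from which depth may be inferred.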
Motion parallax refers to those clues arising from different images of the same scene obtained by one (or both) eyes of a viewer as that viewer's position relative to the scene changes. In other words, as the viewer moves relative to the scene, differing portions of the scene become visible to, or hidden from, the viewer. More particularly, as the viewer moves relative to the scene, objects in the scene which are closer to the viewer appear to move farther than more distant objects. Indeed, at times, closer objects will eclipse more distant objects as the viewer moves. At other times, distant objects will emerge from behind closer objects as the viewer moves.
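The behavior described above can be illustrated with a minimal sketch, assuming a top-down view in which the viewer sits at depth zero and moves only sideways. The function names, the flat occluder of fixed half-width, and the unit focal length are hypothetical choices made for the example.

```python
def project(point_x, point_z, viewer_x, focal_length=1.0):
    """Image-plane position of a scene point (pinhole model, assumed)."""
    return focal_length * (point_x - viewer_x) / point_z


def apparent_shift(point_x, point_z, start_x, end_x, focal_length=1.0):
    """Change in image position as the viewer moves from start_x to end_x."""
    return (project(point_x, point_z, end_x, focal_length)
            - project(point_x, point_z, start_x, focal_length))


def is_hidden(far_point, occluder_center, occluder_half_width, viewer_x):
    """True if a flat occluder blocks the line of sight to far_point.

    Points are (x, z) pairs, with z the distance from the viewer's plane.
    """
    far_x, far_z = far_point
    occ_x, occ_z = occluder_center
    if occ_z >= far_z:
        return False  # the occluder is not in front of the point
    # Lateral position where the line of sight crosses the occluder's depth.
    crossing_x = viewer_x + (occ_z / far_z) * (far_x - viewer_x)
    return abs(crossing_x - occ_x) <= occluder_half_width


near_shift = apparent_shift(0.0, 1.0, 0.0, 0.5)  # object at depth 1
far_shift = apparent_shift(0.0, 4.0, 0.0, 0.5)   # object at depth 4
hidden_before = is_hidden((0.0, 4.0), (0.0, 1.0), 0.2, viewer_x=0.0)
hidden_after = is_hidden((0.0, 4.0), (0.0, 1.0), 0.2, viewer_x=0.5)
```

In this example, the nearer object shifts by four times as much as the farther one for the same viewer movement, and the distant point, initially hidden behind the occluder, becomes visible once the viewer has moved to the side.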
A conventional 3D display system typically includes special glasses, a virtual reality helmet, or some other user-attachable device. The user-attachable device provides cues and feedback information corresponding to the relative positions of the viewer's eyes so that positional changes of the eyes can be tracked. These display systems display images for binocular viewing based on the positional feedback from the user-attachable device.
Existing teleconferencing systems, gaming systems, virtual user interaction systems, and other viewing systems utilize conventional display systems, such as those discussed previously, to partially provide a 3D depiction of video images. While such viewing provides somewhat realistic 3D interaction, many users would prefer to be unencumbered by the user-attachable devices. Accordingly, users of such display systems may prefer a system that locates and tracks a user without a user-attachable device. Moreover, the user-attachable devices tend to increase the complexity and cost of these systems beyond that which some users are willing to tolerate. Furthermore, when the depicted scene is a real-world scene (i.e., not computer-generated), imprecision in the depth estimate may produce artifacts in the generated image.