As small digital video cameras have become cheaper and more ubiquitous, appearing on consumer products such as laptops, PDAs, and cellular phones, video-conferencing has seen increasingly widespread use. Before it can gain broad acceptance, however, video-conferencing must overcome two major difficulties. First, high frame-rate transmission has heavy bandwidth requirements, so average users are often constrained to low frame rates and poor-quality transmission, even when video compression algorithms are used. Second, basic video-conferencing lacks a feeling of common presence and shared space, which fundamentally changes the dynamics of conversation and can make participants feel uncomfortable [9].
Small digital video cameras have become increasingly common, appearing on portable consumer devices such as laptops, PDAs, and cellular phones. The widespread use of video-conferencing, however, is limited in part by the limited bandwidth available on such devices. In addition, video-conferencing can make conversants uncomfortable because it lacks a sense of co-presence. The graphics literature offers a wide range of technologies intended to increase the feeling of co-presence, but many of these techniques are impractical in the consumer market because they require costly and elaborate equipment (such as stereoscopic displays and multi-camera arrays).
More advanced video-conferencing systems introduce co-presence by inducing the perception of three-dimensionality. This can be accomplished using binocular disparity technologies such as stereoscopic displays [1] or augmented reality systems. For example, Prince et al. [10] and Nguyen et al. [8] use head-mounted displays to create a three-dimensional experience of an object. An alternative approach, motion parallax, approximates a three-dimensional experience by rotating a 3D model of the object based on the user's viewing angle. This approach has been reported to provide a greater level of presence than the binocular approach [2], but current implementations require expensive motion-tracking technologies such as multi-camera optical tracking arrays [6] or magnetic motion-capture systems [4]. Regardless of the ultimate display technology, whether binocular or motion-based, generating an image of the object to be displayed requires at least a two-camera system.
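To make the motion-parallax idea concrete, the following minimal sketch shows one way a 3D model point could be rotated according to the user's viewing angle. It is an illustration only, not the implementation used by any of the cited systems. The coordinate frame (screen centered at the origin and facing +z, with the viewer's head at positive z), the function names `viewing_angles` and `rotate_point`, and the sign conventions are all assumptions made for this example.

```python
import math


def viewing_angles(head_x, head_y, head_z):
    """Return (yaw, pitch) in radians of the viewer relative to the screen.

    Assumed frame: screen center at the origin, screen facing +z,
    head_z > 0 is the viewer's distance in front of the screen.
    """
    yaw = math.atan2(head_x, head_z)
    pitch = math.atan2(head_y, math.hypot(head_x, head_z))
    return yaw, pitch


def rotate_point(point, yaw, pitch):
    """Rotate one model point so the side nearest the viewer turns toward +z.

    Assumed convention: rotate by -yaw about the vertical (y) axis,
    then by +pitch about the horizontal (x) axis.
    """
    x, y, z = point
    # Rotation about the y axis by -yaw.
    c, s = math.cos(-yaw), math.sin(-yaw)
    x, z = c * x + s * z, -s * x + c * z
    # Rotation about the x axis by +pitch.
    c, s = math.cos(pitch), math.sin(pitch)
    y, z = c * y - s * z, s * y + c * z
    return (x, y, z)


if __name__ == "__main__":
    # Hypothetical head position: one unit to the right, one unit from the screen.
    yaw, pitch = viewing_angles(1.0, 0.0, 1.0)
    print(rotate_point((1.0, 0.0, 0.0), yaw, pitch))
```

In this sketch, a viewer who moves to the right sees the model's right-hand side turn toward the screen plane, which is the depth cue that motion parallax relies on. A real system would obtain the head position from a tracker and apply the rotation to every vertex of the model before rendering.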