Current video conferencing technology typically uses a single camera to capture RGB data (from the red, green, and blue (RGB) color model) of a local scene. This local scene typically includes the people participating in the video conference, referred to as meeting participants. The data is then transmitted in real time to a remote location and displayed to another meeting participant who is in a different location than the local meeting participants.
While advances in video conferencing technology have helped provide higher-definition capture, compression, and transmission, the experience typically falls short of recreating the face-to-face experience of an in-person conference. One reason is that the typical video conferencing experience lacks correct eye gaze and other conversational geometry. For example, the person being captured remotely typically does not appear to be looking into your eyes, as one would experience in a face-to-face conversation. This is because that person is looking at the screen rather than at the camera. Moreover, three-dimensional (3D) elements such as motion parallax and image depth, as well as the freedom to change perspective in the scene, are lacking because only a single, fixed video camera captures the scene and the meeting participants.