Conventional video conferencing techniques typically employ a camera mounted at one location and directed at a user. The camera acquires an image of the user and the user's background, which is then rendered on the video display of another user. The rendered image typically depicts the user, miscellaneous objects, and background that are within the field-of-view of the acquiring camera. For example, the camera may be mounted on the top edge of a video display within a conference room with the user positioned to view the video display. The camera field-of-view may encompass the user and, in addition, a conference table, chairs, and artwork on the wall behind the user (i.e., anything else within the field-of-view). Typically, the image of the entire field-of-view is transmitted to the video display of a second user. Thus, much of the video display of the second user is filled with irrelevant, distracting, unappealing, or otherwise undesired information. Such information may diminish the efficiency, efficacy, or simply the aesthetics of the video conference, thereby reducing the quality of the user experience.
Conventional chat sessions involve the exchange of text messages. Mere text messages lack the ability to convey certain types of information, such as facial expressions, gestures, or general body language expressed by the participants. Conventional video conferencing techniques may convey images of the participants, but, as discussed above, the video conferencing medium has several shortcomings.
Furthermore, typical video conferencing and chat techniques do not integrate the user with the virtual content (e.g., text) being presented, and the traditional capture of the user and surrounding environment usually appears unnatural and unattractive when juxtaposed against virtual content. Such a display further detracts from the impression that the exchange is face-to-face.