Virtual reality (“VR”) technology simulates a user's physical presence in a synthetic, computer-generated environment. Often, VR equipment includes a VR headset with a display screen and headphones to present realistic images and sounds for the computer-generated environment. With VR technology, a user can look around the computer-generated environment and, in many cases, navigate through and interact with features of the environment. In some VR systems, a user can communicate with one or more other, remote users, who are virtually present in the environment. Similarly, augmented reality (“AR”) technology layers artificial, computer-generated content over camera input and/or audio input, thereby blending what the user sees and hears in their real surroundings with the artificially-generated content.
Although some VR systems generate a synthetic environment from scratch, many VR systems and AR systems create a three-dimensional (“3D”) model of an actual, physical environment using equipment that captures details of the physical environment. For example, the locations of objects in the physical environment are estimated using depth cameras positioned at different viewpoints (perspectives) around the physical environment. For a frame (e.g., associated with a timestamp or time slice), each depth camera captures a “depth map” of depth values that indicate how far objects are from that depth camera. From the depth values in the depth maps for the frame, the VR/AR system creates a 3D model of the physical environment and objects in it. The 3D model can represent the surfaces of objects as vertices in 3D space, which are connected to form triangles that approximately cover the surfaces of the objects. The VR/AR system can track the 3D model as it changes over time, from frame to frame, generating a dynamic 3D model, which provides a deformable, volumetric representation of the objects. A dynamic 3D model—adding the dimension of time to 3D spatial reconstruction—can provide the foundation for 4D reconstruction technology in diverse applications such as 3D telepresence for business conferencing or personal communication, broadcasting of live concerts or other events, and remote education. Typically, a VR/AR system also uses video cameras to record texture details (e.g., color values) for the surfaces of objects in the physical environment from different viewpoints. Such texture details can be applied (“textured” or “stitched”) to surfaces of a dynamic 3D model when producing (“rendering”) views of the 3D model for output.
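To make the depth-map step concrete, the following is a minimal illustrative sketch (not part of any particular VR/AR system described here) of how per-pixel depth values from one depth camera can be back-projected into 3D points using a standard pinhole camera model. The intrinsic parameters (fx, fy, cx, cy) and the tiny depth map below are invented example values; a real system would use calibrated intrinsics and full-resolution depth maps, and would fuse points from multiple cameras into a single model.

```python
# Hypothetical sketch: back-projecting a depth map into camera-space 3D
# points with a pinhole camera model. All parameter values are illustrative.

def depth_map_to_points(depth, fx, fy, cx, cy):
    """Convert per-pixel depth values (in meters) to 3D camera-space points.

    depth: 2D list of depth values; 0.0 marks a missing/unreliable reading.
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    points = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0.0:                   # skip missing depth values
                continue
            x = (u - cx) * z / fx          # pinhole back-projection
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Toy 2x2 depth map; the 0.0 entry models an unreliable depth reading.
depth = [[2.0, 2.0],
         [2.1, 0.0]]
pts = depth_map_to_points(depth, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
```

In a full pipeline, the resulting points from several viewpoints would be registered into a common coordinate system and meshed (e.g., connected into triangles) to form the surface representation described above; the skipped zero-depth pixel also illustrates why unreliable depth values can leave the model incomplete.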
In recent years, VR technology and AR technology have become more common. The commoditization of VR technology and AR technology, together with the availability of inexpensive depth camera technology, has increased interest in ways to generate accurate, dynamic 3D models of a physical environment with spatiotemporal performance capture. In particular, some recent systems are able to fuse multiple depth maps in real time into dynamic 3D models that provide deformable, volumetric representations of objects in a physical environment. Although such systems can produce high-fidelity dynamic 3D models in many scenarios, when depth values are unreliable or unavailable, the resulting dynamic 3D models may be inaccurate or incomplete. Also, in some situations, previous approaches to stitching texture details from different viewpoints onto a dynamic 3D model may produce blurred details or noticeable seams, which can detract from overall quality, especially for regions of interest such as faces. Finally, previous approaches to rendering views of a dynamic 3D model, especially for real-time applications, are limited with respect to how artistic effects may be applied to the rendered views.