Free Viewpoint Video (FVV) is created from images captured by multiple cameras viewing a scene from different viewpoints. FVV generally allows a user to look at a scene from synthetic viewpoints that are created from the captured images and to navigate around the scene. In other words, in FVV each end user can interactively generate synthetic (i.e., virtual) viewpoints of each scene on-the-fly while the video is being rendered and displayed. This creates a feeling of immersion for any end user who is viewing a rendering of the captured scene, thus enhancing their viewing experience.
The process of creating and playing back FVV or other 3D spatial video typically is as follows. First, a scene is simultaneously recorded from many different perspectives using sensors such as RGB cameras and other video and audio capture devices. Second, the captured video data is processed to extract 3D geometric information in the form of geometric proxies using 3D Reconstruction (3DR) algorithms, which derive scene geometry from the input images. Three-dimensional geometric proxies can include, for example, depth maps, point-based renderings, or higher order geometric forms such as planes, objects, billboards, models or other high fidelity proxies such as mesh-based representations. Finally, the original texture data (e.g., RGB data) and geometric proxies are recombined during rendering, for example by using Image Based Rendering (IBR) algorithms, to generate synthetic viewpoints of the scene.
Texture mapping is a method for adding detail, surface texture or color to a computer-generated 3D graphic or 3D model. Projective texturing is a method of texture mapping that allows a textured image to be projected onto a scene as if by a slide projector. For example, in FVV, the original scene image data (for example, RGB image data originally captured of the scene) can be recombined with the geometric proxies by applying the original scene images/texture data to the geometric proxies using projective texture mapping. The geometric proxy is rendered to a virtual viewpoint and surface texture is sampled from adjacent camera images. Projective texturing uses the captured scene to create a depth map of the scene collocated with each original scene image (e.g., RGB image). The depth map indicates how far objects in the scene are from a point of origin along the z-axis. A near-depth surface is closer to the point of origin on the z-axis than a far-depth surface. A depth discontinuity is defined by a jump between a near-depth surface and a far-depth surface.
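The sampling step described above can be illustrated with a minimal sketch. It assumes a simple pinhole camera whose z-axis points into the scene, points already expressed in that camera's coordinates, and a depth map collocated with the camera image. The function names, the focal length, the principal point and the depth tolerance are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Point3 = Tuple[float, float, float]


def project_point(point: Point3, focal: float, cx: float, cy: float
                  ) -> Optional[Tuple[float, float]]:
    """Project a camera-space point to pixel coordinates (pinhole model).

    Returns None for points at or behind the camera (z <= 0).
    """
    x, y, z = point
    if z <= 0:
        return None
    return (focal * x / z + cx, focal * y / z + cy)


def sample_texture(image: List[List[str]], depth_map: List[List[float]],
                   point: Point3, focal: float, cx: float, cy: float,
                   tolerance: float = 0.05) -> Optional[str]:
    """Sample the camera image for a proxy point, like a slide projector.

    The texel is used only if the collocated depth map agrees with the
    point's depth; otherwise the point is treated as occluded.
    """
    uv = project_point(point, focal, cx, cy)
    if uv is None:
        return None
    col, row = int(round(uv[0])), int(round(uv[1]))
    if not (0 <= row < len(image) and 0 <= col < len(image[0])):
        return None  # projects outside the captured image
    if abs(depth_map[row][col] - point[2]) > tolerance:
        return None  # another surface is seen at this pixel
    return image[row][col]


# Hypothetical 3x3 camera image and its collocated depth map.
image = [["gray", "gray", "gray"],
         ["gray", "gray", "red"],
         ["gray", "gray", "gray"]]
depth_map = [[1.0] * 3 for _ in range(3)]

visible = sample_texture(image, depth_map, (0.5, 0.0, 1.0), 2.0, 1.0, 1.0)
occluded = sample_texture(image, depth_map, (1.0, 0.0, 2.0), 2.0, 1.0, 1.0)
```

Both points project to the same pixel, but only the first lies at the depth recorded in the depth map, so only it receives the texel. The depth comparison is one common way to decide visibility and is used here as an assumption, not as the only possible test.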
When creating three-dimensional spatial video, such as Free Viewpoint Video, errors in the geometric proxy can cause errors in projective texturing, leading to artifacts that reduce the image quality. For example, if the geometric proxy does not match the silhouette boundary of an object, low depth (e.g., near depth) textures can end up on high depth (e.g., far depth) surfaces.
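The silhouette mismatch described above can be shown on a single scanline. The following is a simplified sketch, assuming one row of depth values from the captured scene and one from the geometric proxy as seen by the same camera; the function names, the depth values and the threshold are hypothetical.

```python
from typing import List


def find_depth_discontinuities(depths: List[float],
                               threshold: float) -> List[int]:
    """Return each index i where depth jumps between pixel i and i + 1."""
    return [i for i in range(len(depths) - 1)
            if abs(depths[i + 1] - depths[i]) > threshold]


def near_texture_on_far_surface(true_depths: List[float],
                                proxy_depths: List[float],
                                threshold: float) -> List[int]:
    """Return pixels where the proxy is far but the image shows a near object.

    At these pixels the camera image holds foreground color, so projective
    texturing places a near-depth texture on a far-depth proxy surface.
    """
    return [i for i, (t, p) in enumerate(zip(true_depths, proxy_depths))
            if p - t > threshold]


# Scene: foreground object (depth 1) covers pixels 0-2, background is depth 5.
true_depths = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
# Proxy: silhouette is one pixel too small, so pixel 2 is wrongly "far".
proxy_depths = [1.0, 1.0, 5.0, 5.0, 5.0, 5.0]

scene_edges = find_depth_discontinuities(true_depths, 1.0)    # [2]
proxy_edges = find_depth_discontinuities(proxy_depths, 1.0)   # [1]
bad_pixels = near_texture_on_far_surface(true_depths, proxy_depths, 1.0)  # [2]
```

The depth discontinuity in the proxy sits one pixel away from the discontinuity in the scene, and the pixel between them is where the foreground texture lands on the background surface.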