Three-dimensional video technology continues to grow in popularity, and 3D technology capabilities in the entertainment and communications industries in particular have evolved rapidly in recent years. Production studios are now developing a number of titles for 3D cinema release each year, and 3D-enabled home cinema systems are widely available. Research in this sector continues to gain momentum, fuelled by the success of current 3D product offerings and supported by interest from industry, academia and consumers.
3D technology provides an observer with an impression of depth in a compound image, causing parts of the image to appear to project out in front of a display screen, into what is known as observer space, while other parts of the image appear to project backwards into the space behind the screen, into what is known as CRT space.
The term 3D is usually used to refer to a stereoscopic experience, in which an observer's eyes are provided with two slightly different images of a scene, which images are fused in the observer's brain to create the impression of depth. This effect is known as binocular parallax and is typically used in 3D films for cinema release. The technology provides an excellent 3D experience to a stationary observer. However, stereoscopic technology is merely one particular technique for producing 3D video images. Free viewpoint television (FTV) is a new audiovisual system that allows observers to view 3D video content while freely changing position in front of a 3D video display. In contrast to stereoscopic technology, which requires the observer to remain stationary to experience the 3D content, FTV allows an observer to view a scene from many different angles, greatly enhancing the impression of being actually present within the scene.
The FTV functionality is enabled by capturing a scene using many different cameras which observe the scene from different angles or viewpoints. These cameras generate what is known as multiview video. Multiview video can be encoded relatively efficiently by exploiting both the temporal similarities within each view and the spatial similarities between different views. However, even with multiview coding (MVC), the transmission cost for multiview video remains prohibitively high. To address this, current versions of FTV transmit only a subset of the captured views, typically two or three of the available views. To compensate for the missing information, depth or disparity maps are used to recreate the missing data. From the multiview video and depth/disparity information, virtual views can be generated at any arbitrary viewing position. Many techniques exist in the literature to achieve this, depth image-based rendering (DIBR) being one of the most prominent.
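By way of illustration only, the core warping step of a DIBR-style view synthesis may be sketched as follows for a single image row. This is a minimal sketch under stated assumptions and not a definitive implementation: the function name warp_row, the parameter alpha, the non-negative per-pixel disparities and the leftward shift direction are all hypothetical choices made for the example, and practical systems add sub-pixel interpolation and hole filling.

```python
def warp_row(colours, disparities, alpha):
    """Forward-warp one image row from a reference view to a virtual view.

    colours     -- pixel values of the reference view (one row)
    disparities -- assumed non-negative shift, in pixels, of each pixel
                   between the reference view and a second real view
    alpha       -- assumed virtual camera position: 0.0 is the reference
                   view, 1.0 is the second view
    Returns the warped row; None marks holes (disocclusions) that a
    complete renderer would still have to fill.
    """
    width = len(colours)
    out = [None] * width
    best = [-1.0] * width  # largest disparity written so far per target pixel

    for x in range(width):
        d = disparities[x]
        # Assumed sign convention: content shifts left as the camera moves.
        target = x - int(round(alpha * d))
        # Larger disparity means closer to the camera, so it wins occlusions.
        if 0 <= target < width and d > best[target]:
            out[target] = colours[x]
            best[target] = d
    return out


if __name__ == "__main__":
    row = [10, 20, 30, 40, 50]
    disp = [0, 0, 2, 2, 0]  # the two middle pixels belong to a near object
    print(warp_row(row, disp, 0.5))
```

In this sketch the near object overwrites background pixels it moves in front of, while the positions it vacates remain as holes, which is why only two or three transmitted views plus depth or disparity information still require a filling step at the receiver.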
A depth map, as used in FTV, is simply a greyscale image of a scene in which each pixel indicates the distance between the corresponding point on a video object and the optical centre of the capturing camera. A disparity map is an intensity image conveying the apparent shift of a pixel which results from moving from one viewpoint to another. The link between depth and disparity can be appreciated by considering that the closer an object is to a capturing camera, the greater will be the apparent positional shift resulting from a change in viewpoint. A key advantage of depth and disparity maps is that they contain large smooth surfaces of constant grey levels, making them comparatively easy to compress for transmission using current video coding technology.
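The inverse relationship between depth and disparity may be illustrated with a short numerical sketch. The sketch assumes two rectified, parallel cameras, for which disparity equals focal length multiplied by baseline and divided by depth. The function name and the example focal length and baseline are hypothetical values chosen for illustration only.

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Disparity in pixels of a point at depth_m metres.

    Assumes rectified, parallel cameras separated by baseline_m metres,
    each with a focal length of focal_px pixels.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


if __name__ == "__main__":
    focal_px = 1000.0   # assumed focal length, in pixels
    baseline_m = 0.1    # assumed camera separation, in metres
    for depth_m in (1.0, 2.0, 10.0):
        shift = depth_to_disparity(depth_m, focal_px, baseline_m)
        print(f"depth {depth_m:5.1f} m -> disparity {shift:6.1f} px")
```

Under these assumptions, halving the distance to an object doubles its apparent shift between the two viewpoints, which is the behaviour described above.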
Volumetric display, or light field display, is another type of visual system, which provides three-dimensional viewing based on a three-dimensional model of an object.
Regardless of the particular technology used to create a 3D video image, rendering the image believable in order to provide a good 3D experience to an observer remains a challenging task. Most 3D technologies aim to provide as many depth cues as possible in order to assist an observer in tolerating, understanding and enjoying the 3D content. The key depth cue in stereoscopic technologies is the binocular parallax discussed above. However, this is merely one of many depth cues used in ordinary life to give rise to depth perception. The absence of additional depth cues which would normally be available to an observer's brain to process depth perception can hinder the development of a fully immersive 3D video experience.