Over the last few years various companies have been actively developing autostereoscopic displays suitable for rendering three-dimensional (3D) images. These devices can present viewers with a 3D impression without the need for special headgear and/or glasses.
Autostereoscopic displays generally generate different views for different viewing angles. In this manner a first image can be generated for the left eye and a second image for the right eye of a viewer. By displaying images that are appropriate from the viewpoint of the left and right eye respectively, it is possible to convey a 3D impression to the viewer.
A variety of techniques can be used to generate images for such autostereoscopic displays. For example, multi-view images can be recorded using multiple cameras, wherein the position of each camera corresponds with the viewpoint of the respective view. Alternatively, individual images can be generated using a 3D computer model.
In order to maintain backwards compatibility and improve on bandwidth usage, many autostereoscopic displays use an input sequence in the form of a sequence of conventional 2D images and corresponding depth-maps.
Depth-maps provide depth information indicative of the absolute or relative distance of objects depicted in the image to the camera. By way of example, a common way of representing depth information is by means of an 8-bit grey-scale image. Depth-maps can provide depth information on a per-pixel basis but, as will be clear to the skilled person, may also use a coarser granularity, such as a lower resolution depth-map wherein each depth-map value provides depth information for multiple pixels.
Disparity maps can be used as an alternative to the above-mentioned depth-maps. Disparity refers to the apparent shift of objects in a scene when observed from two viewpoints, such as from the left-eye and the right-eye viewpoint. Disparity information and depth information are related and can be mapped onto one another as is commonly known to those skilled in the art.
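By way of illustration only, the following minimal sketch shows one possible 8-bit representation and a lookup in a coarser depth-map. The function names, the linear quantization between an assumed near plane and far plane, and the convention that 255 denotes the nearest object are all assumptions made for this example; they are not prescribed by the description, and other conventions (such as quantizing inverse depth) are equally possible.

```python
def quantize_depth(z, z_near, z_far):
    """Map a distance z to an 8-bit grey value.

    Assumed convention (not from the description): linear mapping,
    255 = nearest (z_near), 0 = farthest (z_far).
    """
    z = min(max(z, z_near), z_far)        # clamp into the representable range
    t = (z_far - z) / (z_far - z_near)    # 1.0 at z_near, 0.0 at z_far
    return round(255 * t)


def depth_at_pixel(depth_map, x, y, block):
    """Return the depth value that applies to image pixel (x, y).

    depth_map is a coarse map (list of rows) in which each entry
    provides depth information for a block x block square of pixels.
    """
    return depth_map[y // block][x // block]


# A 2x2 depth-map covering an 8x8 image (each value covers 4x4 pixels).
coarse = [[255, 128],
          [64, 0]]

near_value = quantize_depth(1.0, z_near=1.0, z_far=10.0)
far_value = quantize_depth(25.0, z_near=1.0, z_far=10.0)   # clamped to z_far
shared_value = depth_at_pixel(coarse, x=5, y=2, block=4)
```

In this sketch all sixteen pixels of a block share a single depth value, which is the trade-off of the coarser granularity: less data to transmit, at the cost of depth detail within a block.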
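One commonly used mapping between depth and disparity can be sketched as follows. The sketch assumes an idealized pair of parallel (rectified) pinhole cameras, for which disparity equals focal length times baseline divided by depth. The function and parameter names are hypothetical, and an actual display or camera arrangement may require a different mapping.

```python
def depth_to_disparity(z, focal_px, baseline):
    """Disparity (in pixels) of a point at distance z.

    Assumes two parallel pinhole cameras: d = f * B / z, with the
    focal length f in pixels and the baseline B in the unit of z.
    """
    if z <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline / z


def disparity_to_depth(d, focal_px, baseline):
    """Inverse mapping under the same assumptions: z = f * B / d."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline / d


# Hypothetical values: 800-pixel focal length, 0.5 unit baseline.
d_near = depth_to_disparity(2.0, focal_px=800.0, baseline=0.5)
d_far = depth_to_disparity(8.0, focal_px=800.0, baseline=0.5)
z_back = disparity_to_depth(d_near, focal_px=800.0, baseline=0.5)
```

Under these assumptions disparity is inversely proportional to depth, so nearby objects shift more between the two viewpoints than distant ones.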
In view of the above, the terms depth-related information and depth values are used throughout the description and are understood to comprise at least depth information as well as disparity information.
By providing an autostereoscopic display with an image sequence and a corresponding sequence of depth-related information maps, or depth-maps for short, the autostereoscopic display can render multiple views of the content for one or more viewers. Although newly created content might be provided with accurately recorded depth values, more conventional two-dimensional (2D) image sequences do not comprise the required depth values.
Various approaches to convert 2D to 3D content are known, some of which address real-time conversion without human intervention, whereas others address human-assisted 2D to 3D conversion. In the latter approach an operator generally defines depth information for selected key frames and this depth information is subsequently propagated to the non-key frames. Similar approaches may be used to propagate depth values when depth values are available only for a subset of images in the image sequence.
A known approach is presented in International Patent Application WO2002/13141. According to this approach a network is trained using annotated depth values for a subset of pixels from a key frame. This information is used to learn the relationship between texture information and depth characteristics. The trained network is subsequently used to generate depth information for the entire key frame. During a second phase the depth-maps of the key frames are used to generate depth-maps for non-key frames from image characteristics and relative distance to the key frame(s).