In order to generate a 3D impression on a multi-view display device, images from different virtual viewpoints have to be rendered. This requires either multiple input views or some 3D or depth information to be present. This depth information can be recorded, generated from multi-view camera systems, or generated from conventional 2D video material. For generating depth information from 2D video, several types of depth cues can be applied, such as structure from motion, focus information, geometric shapes and dynamic occlusion. The resulting depth map is subsequently used in rendering a multi-view image to give the viewer a depth impression.
WO 2005/083631 discloses a method of generating a depth map comprising depth values representing distances to a viewer, for respective pixels of an image. This method comprises: computing a cost value for a first one of the pixels of the image by combining differences between values of pixels which are disposed on a path from the first one of the pixels to a second one of the pixels which belongs to a predetermined subset of the pixels of the image; and assigning a first one of the depth values corresponding to the first one of the pixels on the basis of the cost value.
The cited method is based on the following observation. Objects in a scene to be imaged have different sizes, luminances, and colors, and have a certain spatial disposition. Some of the objects occlude other objects in the image. Differences between luminance and/or color values of pixels in an image are primarily related to the differences between optical characteristics of the surfaces of the objects and to the spatial positions of objects relative to light sources within the scene. Optical characteristics of surfaces comprise, for example, color and reflectiveness. Hence, a relatively large transition in luminance and/or color, or a relatively large difference between pixel values of neighboring pixels, corresponds to a transition between a first image segment and a second image segment, whereby the first image segment corresponds to a first object and the second image segment corresponds to a second object in the scene being imaged. By determining for the pixels of the image the number and extent of transitions in luminance and/or color, or differences between pixel values, on a path from the respective pixels to a predetermined location of the image, respective measures related to the spatial disposition of the objects in the scene can be obtained. These measures, or cost values, are subsequently translated into depth values. This translation is preferably a multiplication of the cost value by a predetermined constant. Alternatively, this translation corresponds to a mapping of the respective cost values to a predetermined range of depth values by means of normalization. It should be noted that the background also forms one or more objects, e.g. the sky or a forest or a meadow.
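By way of illustration only, the cost computation and the translation into depth values described above may be sketched as follows. The sketch rests on several assumptions that are not taken from the cited document: the image is a small grid of luminance values, the predetermined subset of pixels is taken to be the top row, each path is a straight vertical line, and the differences are combined by summing their absolute values. The names path_costs and costs_to_depth are hypothetical. Whether a larger resulting value denotes a nearer or a farther pixel is a matter of convention and is not fixed by the sketch.

```python
def path_costs(image):
    """Cost per pixel: sum of absolute luminance differences between
    consecutive pixels on a straight vertical path from the pixel up to
    the top row (the top row is an assumed 'predetermined subset')."""
    height, width = len(image), len(image[0])
    costs = [[0] * width for _ in range(height)]
    for y in range(1, height):
        for x in range(width):
            # Transition between this pixel and its neighbour on the path.
            step = abs(image[y][x] - image[y - 1][x])
            # Accumulate on top of the cost already reached one row above.
            costs[y][x] = costs[y - 1][x] + step
    return costs


def costs_to_depth(costs, max_depth=255):
    """Map cost values to the range [0, max_depth] by normalization."""
    peak = max(max(row) for row in costs)
    if peak == 0:
        # Uniform image: no transitions, so every depth value is zero.
        return [[0] * len(row) for row in costs]
    return [[c * max_depth // peak for c in row] for row in costs]


# Hypothetical 4x3 luminance image: a bright object in the middle column
# above a slightly brighter bottom row.
image = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 200, 10],
    [50, 50, 50],
]
costs = path_costs(image)
depth = costs_to_depth(costs)
```

In this sketch the pixels below the bright object accumulate two large transitions (into and out of the object) and therefore receive the largest cost. The alternative translation mentioned above, multiplication by a predetermined constant, would replace costs_to_depth by a per-pixel product of the cost value and that constant.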
The depth value that is based on the luminance and/or color transitions can be used directly as the depth value for rendering a multi-view image.