The presentation of ‘three-dimensional’ images by arranging for the viewer's left and right eyes to see different images of the same scene is well known. Such images are typically created by a ‘stereoscopic’ camera that comprises two cameras that view the scene from respective viewpoints that are horizontally spaced apart by a distance similar to that between the left and right eyes of a viewer.
This technique has been used for ‘still’ and ‘moving’ images. There is now great interest in using the electronic image acquisition, processing, storage and distribution techniques of high-definition television for stereoscopic motion-images.
Many ways of distributing stereoscopic image sequences have been proposed. One example is the use of separate image data streams or physical transport media for the left-eye and right-eye images. Another example is the ‘side-by-side’ representation of left-eye and right-eye images in a frame or raster originally intended for a single image. Other methods include dividing the pixels of an image into two interleaved groups and allocating one group to the left-eye image and the other group to the right-eye image; for example, alternate lines of pixels can be used for the two images.
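By way of illustration only, the side-by-side and line-interleaved representations described above can be sketched as follows. Images are modelled as lists of pixel rows. The function names, the crude decimation without pre-filtering, and the choice of which half of the frame or which set of lines carries the left-eye image are all assumptions made for this sketch; they are not taken from any particular transmission standard.

```python
def pack_side_by_side(left, right):
    """Pack two equal-sized images into one frame of the original width.

    Each image is horizontally decimated by keeping every second pixel
    (no filtering; an assumption for brevity). The left-eye image is
    assumed to occupy the left half of the frame.
    """
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def unpack_side_by_side(frame):
    """Split a side-by-side frame into its two half-width images."""
    half = len(frame[0]) // 2
    first = [row[:half] for row in frame]
    second = [row[half:] for row in frame]
    return first, second


def pack_line_interleaved(left, right):
    """Take even-numbered lines from the left-eye image and odd-numbered
    lines from the right-eye image (this allocation is an assumption)."""
    return [left[y] if y % 2 == 0 else right[y] for y in range(len(left))]


def unpack_line_interleaved(frame):
    """Split a line-interleaved frame into its even-line and odd-line images."""
    return frame[0::2], frame[1::2]


if __name__ == "__main__":
    left = [[1, 2, 3, 4], [5, 6, 7, 8]]
    right = [[10, 20, 30, 40], [50, 60, 70, 80]]
    print(pack_side_by_side(left, right))
    print(pack_line_interleaved(left, right))
```

Nothing in either packed frame indicates which half, or which set of lines, carries the left-eye image; a receiver that assumes the opposite allocation will transpose the two images.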
To present the viewer with the correct illusion of depth, it is essential that his or her left eye sees the image from the left-side viewpoint, and vice versa. If the left-eye and right-eye images are transposed so that the left eye sees the view of the scene from the right and the right eye sees the view from the left, there is no realistic depth illusion and the viewer will feel discomfort. This is in marked contrast to the analogous case of stereophonic audio reproduction, where transposition of the left and right audio channels produces a valid, equally pleasing (but different) auditory experience.
The multiplicity of transmission formats for stereoscopic images leads to a significant probability of inadvertent transposition of the left and right images. The wholly unacceptable viewing experience that results from transposition gives rise to a need for a method of detecting, for a given ‘stereo-pair’ of images, which is the left-eye image, and which is the right-eye image. In this specification the term ‘stereo polarity’ will be used to denote the allocation of a stereo pair of images to the two image paths of a stereoscopic image processing or display system. If the stereo polarity is correct then the viewer's left and right eyes will be presented with the correct images for a valid illusion of depth.
In a stereo pair of images, depth is represented by the difference in horizontal position (the horizontal disparity) between the representations of a particular object in the two images of the pair. Objects intended to appear in the plane of the display device have no disparity; objects behind the display plane are moved to the left in the left image and to the right in the right image; and objects in front of the display plane are moved to the right in the left image and to the left in the right image.
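The sign convention described above can be illustrated by a minimal sketch. Here disparity is defined, as an assumption for this example, as the horizontal position of an object in the right-eye image minus its position in the left-eye image, with positions increasing to the right; the function name and the returned labels are hypothetical.

```python
def apparent_depth(x_left, x_right):
    """Classify where an object appears relative to the display plane.

    x_left and x_right are the horizontal positions of the same object
    in the left-eye and right-eye images (increasing to the right).
    """
    disparity = x_right - x_left
    if disparity > 0:
        # Object is further left in the left image: behind the display plane.
        return "behind"
    if disparity < 0:
        # Object is further right in the left image: in front of the display plane.
        return "in front"
    return "in plane"


if __name__ == "__main__":
    print(apparent_depth(100, 104))  # object positions with correct polarity
    print(apparent_depth(104, 100))  # the same positions with the images transposed
```

Transposing the two images reverses the sign of every disparity, so that an object intended to appear behind the display plane is instead classified as in front of it.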
If it were known that all (or a majority of) portrayed objects were intended to be portrayed behind the display plane, then measurement of disparity would enable the left-eye and right-eye images to be identified: in the left-eye image objects would be further to the left than in the right-eye image; and, in the right-eye image objects would be further to the right than in the left-eye image.
However, it is common for objects to be portrayed either in front of or behind the display plane; and a constant value may be added to, or subtracted from, the disparity for a pair of images as part of the process of creating a stereoscopic motion-image sequence. For these reasons a simple measurement of horizontal disparity cannot be relied upon to identify left-eye and right-eye images of a stereo pair.
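The weakness of a simple disparity measurement can be illustrated by the following sketch. It assumes that a disparity value (second-path position minus first-path position) has already been measured for each of several objects, and it guesses the stereo polarity from the sign of the mean disparity. The function names, the use of the mean, and the sample values are assumptions made for illustration only; the sketch does not represent any particular prior method.

```python
def guess_polarity(disparities):
    """Naive guess: assume most objects lie behind the display plane,
    so that a positive mean disparity implies correct polarity."""
    mean = sum(disparities) / len(disparities)
    if mean > 0:
        return "correct"
    if mean < 0:
        return "transposed"
    return "undetermined"


def shift_disparity(disparities, offset):
    """Add a constant value to every disparity, as may happen when a
    stereoscopic sequence is created."""
    return [d + offset for d in disparities]


if __name__ == "__main__":
    # Hypothetical measurements from a correctly allocated stereo pair.
    scene = [6, 4, 2, -1]
    print(guess_polarity(scene))
    # The same pair after a constant is subtracted from the disparity.
    print(guess_polarity(shift_disparity(scene, -5)))
```

In the second case the images have not been transposed, yet the guess changes, because the constant offset has moved the mean disparity from positive to negative. A scene portrayed wholly in front of the display plane would be misclassified in the same way.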
Attempts have been made to overcome this problem by making statistical assumptions about image portrayal, specifically that an object appearing lower in an image is assumed to be to the front of an object appearing higher in the image. Reference is directed in this context to U.S. Pat. No. 6,268,881 and US 2010/0060720. It will be understood that in many image pairs, such an assumption cannot be relied upon. For robust detection, it is desirable to reduce the reliance placed upon statistical assumptions.