In the field of computer imaging, it is often desirable to be able to produce a view of a scene from a viewpoint other than those already available. An application of such a technique is in content generation for stereoscopic displays. In a stereoscopic display, two slightly different images of the same scene are supplied to a user, one image being supplied to each eye, the perception of the user being that of a three-dimensional scene. It may be the case that the depth range provided by the two images is too great for comfortable viewing on a given three-dimensional display, in which case it is desirable to vary the depth range in the stereogram. No explicit three-dimensional information about the scene may be available, and a method of providing a new image of the scene on the basis of existing images only would allow the creation of a new stereogram which would be more comfortable for the user to view.
Several existing view synthesis techniques are known. Chen and Williams, View Interpolation for Image Synthesis, Proc. ACM SIGGRAPH 1993, pages 279-288 and Chen and Williams, Three-Dimensional Image Synthesis using View Interpolation, U.S. Pat. No. 5,613,048 describe how a three-dimensional scene can be represented by a collection of two-dimensional images. It is assumed that a number of “correspondence maps” are available. These maps define the position in each two-dimensional image of a given feature as the viewpoint changes. It is suggested that the maps could be obtained by manually labelling features in the source images or by reference to a geometric model of the scene. The changes in position of a given feature define the binocular disparities of the points in the scene.
To produce a novel view of the scene, pixels from the source images are “forward mapped” into a new image in accordance with the correspondence maps. If multiple features are mapped to the same target position in the novel view, the feature that is “in front” is chosen. This technique can reduce the amount of storage space and rendering time required without necessarily compromising the quality of the novel view. There is, however, no guarantee that the novel view will be complete: there may be regions in the novel view which were not populated with colour values in the forward mapping process. It is suggested that the novel view should initially be filled with a distinctive colour which is subsequently overwritten by the forward-mapped pixel values. Any gaps in the novel view can therefore be easily identified. The gaps are then filled by interpolation from the surrounding pixel colours or interpolation of the surrounding correspondence map information with reference back to the source images. Neither of these techniques produces an accurate representation. Interpolation of the surrounding pixel colours can only produce a smooth transition between the nearby pixels, resulting in a poor approximation to any image structure that ought to appear in the missing region. The novel view will therefore appear blurred, and this problem will increase with the size of the region from which information is missing. Simple interpolation of the correspondence map data in the neighbourhood of the gap is not usually appropriate, as it does not account for depth-discontinuities.
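The forward mapping and gap filling described above can be sketched for a single scanline as follows. This is a minimal illustration and not the implementation of Chen and Williams: the function names, the use of `None` as the "distinctive colour", the convention that a larger disparity means a nearer feature, and the mapping j′ = j − t·d are all assumptions made for the example.

```python
GAP = None  # sentinel standing in for the "distinctive colour" (assumption)

def forward_map_row(colours, disparities, t):
    """Forward map one scanline to the view at parameter t (0 = source view)."""
    width = len(colours)
    out = [GAP] * width
    front = [float("-inf")] * width  # disparity of the pixel currently stored
    for j, (c, d) in enumerate(zip(colours, disparities)):
        target = round(j - t * d)
        # On a collision keep the feature that is "in front",
        # assumed here to be the one with the larger disparity.
        if 0 <= target < width and d > front[target]:
            out[target] = c
            front[target] = d
    return out

def fill_gaps(row):
    """Fill gaps by linear interpolation between the nearest populated pixels."""
    filled = list(row)
    known = [j for j, c in enumerate(row) if c is not GAP]
    for j, c in enumerate(row):
        if c is not GAP or not known:
            continue
        left = max((k for k in known if k < j), default=None)
        right = min((k for k in known if k > j), default=None)
        if left is None:
            filled[j] = row[right]
        elif right is None:
            filled[j] = row[left]
        else:
            filled[j] = row[left] + (row[right] - row[left]) * (j - left) / (right - left)
    return filled

row = forward_map_row([10, 20, 30, 40, 70, 80], [0, 0, 2, 2, 0, 0], t=1.0)
print(row)             # [30, 40, None, None, 70, 80]
print(fill_gaps(row))  # [30, 40, 50.0, 60.0, 70, 80]
```

In this example the foreground pixels (disparity 2) overwrite the background pixels they land on, and the two positions they vacate are left unpopulated. The interpolated values form a smooth ramp between the neighbouring colours, which is why any image structure that ought to appear in the gap is lost.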
A theoretical analysis of view interpolation is given in Seitz and Dyer, View Morphing, Proc. ACM SIGGRAPH 1996, pages 21-30, which improves upon the method of Chen and Williams as applied to real images. Again, correspondence maps are obtained defining the image position of a given feature. A stereo matching algorithm can be applied to pairs of images to generate the correspondence map. To render a novel view, Seitz and Dyer use the same forward mapping procedure as Chen and Williams. Colour information from both source images is blended to produce pixel values in the novel view. Gaps in the novel view are filled by interpolating between the nearby pixels.
The method of Seitz and Dyer is illustrated in the flow diagrams of FIG. 1a and FIG. 1b. FIG. 1a illustrates the geometric processing which is undertaken. Positions (i, jL) and (i, jR) of points in the original images IL and IR are matched by matching algorithm M to produce disparity maps DL and DR. The matches are obtained by searching for similar image features so that IL (i, jL) and IR (i, jR) are approximately equal since they represent the same feature of the scene. Alternatively, this may be considered as a position (i, jL) in IL having a corresponding point in IR at (i, jR)=(i, jL−DL (i, jL)). Likewise, the point (i, jR) in IR has a corresponding point in IL at (i, jL)=(i, jR+DR (i, jR)). The row index i is the same in the two images since the images are assumed to be rectified, that is, common features are expected to be related by horizontal translations only.
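The relation between a position jL and its disparity DL can be illustrated with a simple scanline matcher. The text does not specify the matching algorithm M, so the sketch below substitutes a basic sum-of-absolute-differences window search. The function name `match_row` and the parameters `max_disp` and `half_window` are hypothetical.

```python
def match_row(left, right, max_disp, half_window=1):
    """Estimate DL for one rectified scanline so that left[j] ~ right[j - DL[j]]."""
    width = len(left)

    def cost(jl, d):
        # Sum of absolute differences over a small window, clamped at the borders.
        total = 0
        for k in range(-half_window, half_window + 1):
            a = min(max(jl + k, 0), width - 1)
            b = min(max(jl + k - d, 0), width - 1)
            total += abs(left[a] - right[b])
        return total

    disp = []
    for jl in range(width):
        # Only consider disparities that keep jR = jL - d inside the image.
        candidates = range(0, min(max_disp, jl) + 1)
        disp.append(min(candidates, key=lambda d: cost(jl, d)))
    return disp

left = [0, 0, 5, 9, 5, 0, 0, 0]
right = [5, 9, 5, 0, 0, 0, 0, 0]  # the same feature, two pixels further left
DL = match_row(left, right, max_disp=3)
jL = 3
jR = jL - DL[jL]  # corresponding column in the right image
```

Because the images are assumed to be rectified, the search runs along a single row only. In textureless regions the cost is ambiguous and the estimate is unreliable, so only the textured pixels of the example should be expected to give the true disparity.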
FIG. 1b illustrates the colour processing of the image. Colours from the original images IL and IR are forward mapped into the coordinates jL′ and jR′ of two novel images IL′ and IR′. In each case, the final colour is obtained by a blending operation B. The blending operation produces a colour value in the novel images IL′ and IR′ for each of the corresponding image features.
FIG. 2 illustrates the positions of the cameras associated with the original images IL and IR. The position of a novel view between the two original images may be defined using a parameter t, which varies from 0 at the position of IL to 1 at the position of IR. For the synthesis of stereoscopic image pairs, two such parameters are required: tL and tR for novel views IL′ and IR′, respectively.
The process illustrated in FIG. 1b can be understood by following the route of a particular pixel. For example, starting from the top left box of FIG. 1b, a colour from image IL at position jL is selected. The disparity map DL is used to map this colour to two new positions. The first new position lies in image IL′ at position jL′=jL−tL DL (jL). The second new position is in image IR′ at position jR′=jL−tR DL (jL). Similarly, a colour from image IR at position jR is mapped to positions jL′=jR+(1−tL) DR (jR) in IL′ and to jR′=jR+(1−tR) DR (jR) in image IR′ using disparity map DR. This gives a pair of colours at each position in IL′ and IR′ which are blended as indicated.
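The two forward mappings and the blending step can be sketched for one scanline and one novel view as follows. The text does not define the blending operation B, so the weights (1 − t) and t used here are an assumption, as are the function names, the use of `None` for unpopulated positions and the rule that the larger disparity wins a collision.

```python
def splat(colours, disparities, scale):
    """Forward map each colour from column j to column j + scale * d."""
    width = len(colours)
    out = [None] * width
    depth = [float("-inf")] * width
    for j, (c, d) in enumerate(zip(colours, disparities)):
        target = round(j + scale * d)
        if 0 <= target < width and d > depth[target]:
            out[target], depth[target] = c, d
    return out

def synthesise_row(left, right, disp_l, disp_r, t):
    """Render one scanline of the novel view at parameter t."""
    from_left = splat(left, disp_l, -t)          # j' = jL - t * DL(jL)
    from_right = splat(right, disp_r, 1.0 - t)   # j' = jR + (1 - t) * DR(jR)
    row = []
    for a, b in zip(from_left, from_right):
        if a is None or b is None:
            # Only one source (or neither) reached this position.
            row.append(a if b is None else b)
        else:
            # Assumed blend: weight each source by its proximity to the novel view.
            row.append((1.0 - t) * a + t * b)
    return row

left = [1, 2, 3, 4, 5, 6]
right = [3, 4, 5, 6, 7, 8]   # same scene with a uniform disparity of 2
print(synthesise_row(left, right, [2] * 6, [2] * 6, t=0.5))
# [2, 3.0, 4.0, 5.0, 6.0, 7]
```

For a stereoscopic pair the function would be called twice, once with tL to produce IL′ and once with tR to produce IR′. Positions reached by neither source image remain `None` and correspond to the gaps that must be filled by interpolation.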
U.S. Pat. No. 6,215,496 describes a modified “sprite” representation in which ordinary sprites can be used to render relatively flat or distant parts of a scene. Only one input image is used, taken in conjunction with an associated depth map. The depth information is forward mapped to the output position and the colour is reverse mapped to the single input image. This representation can be improved by allowing each pixel to be independently offset from the sprite mapping. The offsets are determined by the scene depth of the points in relation to the surface that is being warped. It is assumed that the offsets can be obtained from an existing model of the surface, or computed by a correspondence algorithm. The method of U.S. Pat. No. 6,215,496 is illustrated in FIGS. 3a and 3b, which show the geometric and colour processing, respectively. The original images IL and IR are matched as above. The disparities are then forward mapped, giving new disparity maps DL′ and DR′ in the coordinates of the new images. Specifically, a disparity DL (jL) is mapped to position jL′=jL−tL DL (jL) in new disparity map DL′ and similarly a disparity DR (jR) is mapped to position jR′=jR+(1−tR) DR (jR) in new disparity map DR′.
The colour processing shown in FIG. 3b starts from a position in one of the novel views, using the corresponding disparity map to look up a colour in one of the source images. This colour is then rendered in the novel view. For example, position jL′ in image IL′ corresponds to disparity DL′(jL′). This means that jL′ can be mapped to position jL=jL′+tL DL′(jL′) in image IL. The colour IL (jL) can then be rendered to position jL′ in image IL′.
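The two stages, forward mapping of the disparities followed by reverse mapping of the colours, can be sketched for the left view as follows. This is an illustrative sketch under stated assumptions and not the implementation of the cited patent: the function names are hypothetical, `None` marks positions that receive no disparity, and the larger disparity is assumed to win when two disparities land on the same position.

```python
def forward_map_disparity(disp_l, t_l):
    """Build DL' by moving each DL(jL) to position jL' = jL - tL * DL(jL)."""
    width = len(disp_l)
    new_disp = [None] * width
    for j, d in enumerate(disp_l):
        target = round(j - t_l * d)
        if 0 <= target < width and (new_disp[target] is None or d > new_disp[target]):
            new_disp[target] = d
    return new_disp

def backward_render(left, new_disp, t_l):
    """Render IL' by looking up IL at jL = jL' + tL * DL'(jL')."""
    width = len(left)
    out = [None] * width
    for jp, d in enumerate(new_disp):
        if d is None:
            continue  # no disparity reached this position
        src = round(jp + t_l * d)
        if 0 <= src < width:
            out[jp] = left[src]
    return out

left = [10, 20, 30, 40, 50, 60]
disp_l = [0, 0, 2, 2, 0, 0]
new_disp = forward_map_disparity(disp_l, t_l=1.0)
print(new_disp)                              # [2, 2, None, None, 0, 0]
print(backward_render(left, new_disp, 1.0))  # [30, 40, None, None, 50, 60]
```

Since every populated position in the novel view fetches its colour by a lookup into the source image, the colour stage itself cannot create collisions. Positions that received no disparity in the first stage remain unpopulated.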
The method of Chen and Williams is designed to work with computer-generated images. The method of Seitz and Dyer uses forward mapping and so can leave gaps in the novel view that must be filled by interpolation. The method of U.S. Pat. No. 6,215,496 is designed to produce novel views of flat or distant parts of a scene based on a single source image. It is desired to be able to provide a method of generating novel views of a scene. In particular, it is desired to provide a method which can be used to vary the depth range in a stereogram when no explicit three-dimensional information about the scene is available.
U.S. Pat. No. 5,917,937 discloses a stereo matching algorithm for determining disparities, colours and opacities from multiple input images. This algorithm generates a general disparity space and makes estimates of colours and opacities in the disparity space by forward mapping the source images into the disparity space. Statistical methods are used to refine the estimates.