The present invention relates to combining digital images.
Digital images typically comprise two-dimensional arrays of picture elements (pixels) and may be, for example, digitized photographs or computer-generated images. Many applications exist for combining digital images, including applications for determining camera motion between video frames to stabilize video images, for relating or recognizing the content of different images, for aligning images for mosaicing, for high-resolution enhancement, and for building detailed models for virtual reality applications. Further discussion of various applications is found in articles such as S. Mann & R. W. Picard, Video Orbits of the Projective Group: A New Perspective on Image Mosaicing, M.I.T. Media Laboratory Perceptual Computing Section Technical Report No. 338 (1995) and Richard Szeliski, Image Mosaicing for Tele-Reality Applications, Cambridge Research Laboratory Technical Report Series (CRL 94/2) (1994), both of which are incorporated by reference.
Combining images typically requires "registering" them, that is, matching two or more images containing overlapping scenes and describing the correspondence of one to another. The correspondence enables the images to be combined by mapping the image data into a common image space using any of a variety of transforms, such as affine and projective transforms. As described in the Mann & Picard article, affine methods are simpler and are acceptable approximations when the correspondence between pictures is high, when the images have a small field of view, or when the content of the image is planar. Projective transform methods are more complex but can produce results that are mathematically more accurate for images acquired from a fixed camera location. Existing projective transform methods typically register a first image with a second by determining transform parameters corresponding to a two-dimensional projective transformation:
$$u' = \frac{m_0 u + m_1 v + m_2}{m_6 u + m_7 v + 1}, \qquad v' = \frac{m_3 u + m_4 v + m_5}{m_6 u + m_7 v + 1}$$
where (u, v) are the coordinates in an image space of a pixel of the first image and (u', v') are the coordinates of the pixel mapped into an image space of the second image. This transform has eight parameters, or degrees of freedom ($m_0, \ldots, m_7$). Solving for the eight degrees of freedom typically requires a non-linear approach, which can be computationally expensive and is not guaranteed to produce a correct result.
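As an illustration of how such an eight-parameter mapping acts on a pixel coordinate, the following minimal Python sketch applies the transform to a single point. It assumes the commonly used parameter layout in which the first six parameters form the affine part and the last two appear in a shared denominator; the function name `project` and the sample parameter tuples are hypothetical and chosen only for this example.

```python
def project(m, u, v):
    """Map (u, v) in the first image to (u', v') in the second image.

    `m` holds the eight parameters (m0, ..., m7). The layout is an
    assumption: m0..m5 are the affine terms, m6 and m7 the perspective
    terms in the common denominator.
    """
    m0, m1, m2, m3, m4, m5, m6, m7 = m
    w = m6 * u + m7 * v + 1.0  # shared projective denominator
    if w == 0.0:
        raise ZeroDivisionError("point maps to infinity under this transform")
    u_prime = (m0 * u + m1 * v + m2) / w
    v_prime = (m3 * u + m4 * v + m5) / w
    return u_prime, v_prime


# Hypothetical parameter sets for illustration only.
identity = (1, 0, 0, 0, 1, 0, 0, 0)        # leaves every point unchanged
translation = (1, 0, 5, 0, 1, -3, 0, 0)    # affine case: m6 = m7 = 0
perspective = (1, 0, 0, 0, 1, 0, 0.5, 0)   # non-zero m6 makes the scale depend on u

print(project(identity, 2, 3))
print(project(translation, 2, 3))
print(project(perspective, 2, 4))
```

When the last two parameters are zero the denominator is always 1 and the mapping reduces to an affine transform with six degrees of freedom. The sketch covers only the forward mapping of a point once the parameters are known; estimating the parameters from a pair of images is the non-linear problem described above and is not attempted here.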