Conventional systems for generating images comprising a large field of view of a scene from a plurality of images generally have two steps: (1) an image capture step, where the plurality of images of a scene are captured with overlapping pixel regions; and (2) an image combining step, where the captured images are digitally processed and blended to form a composite digital image.
In some of these systems, images are captured about a common rear nodal point. For example, in U.S. Ser. No. 09/224,547, filed Dec. 31, 1998 by May et al., overlapping images are captured by a digital camera that rotates on a tripod, thus ensuring that each image is captured with the same rear nodal point lying on the axis of rotation of the tripod.
In other systems, the capture constraint is weakened so that the images can be captured from substantially similar viewpoints. One example of a weakly-constrained system is the image mosaic construction system described in U.S. Pat. No. 6,097,854 by Szeliski et al., issued Aug. 1, 2000; also described in Shum et al., “Systems and Experiment Paper: Construction of Panoramic Image Mosaics with Global and Local Alignment,” IJCV 36(2), pp. 101–130, 2000. Another example is the “stitch assist” mode in the Canon PowerShot series of digital cameras (see http://www.powershot.com/powershot2/a20_a10/press.html; U.S. Pat. No. 6,243,103 issued Jun. 5, 2001 to Takiguchi et al.; and U.S. Pat. No. 5,138,460 issued Aug. 11, 1992 to Egawa).
In some systems, the capture constraint is removed altogether, and the images are captured at a variety of different locations. For example, the view morphing technique described in Seitz and Dyer, “View Morphing,” SIGGRAPH '96, in Computer Graphics, pp. 21–30, 1996, is capable of generating a composite image from two images of an object captured from different locations.
The digital processing required in the image combining step depends on the camera locations of the captured images. When the rear nodal point is exactly the same, the image combining step comprises three stages: (1) a warping stage, where the images are geometrically warped onto a cylinder, sphere, or any geometric surface suitable for viewing; (2) an image alignment stage, where the warped images are aligned by a process such as phase correlation (Kuglin, et al., “The Phase Correlation Image Alignment Method,” Proc. 1975 International Conference on Cybernetics and Society, 1975, pp. 163–165), or cross correlation (textbook: Gonzalez, et al., Digital Image Processing, Addison-Wesley, 1992); and (3) a blending stage, where the aligned warped images are blended together to form the composite image. The blending stage can use a simple feathering technique that uses a weighted average of the images in the overlap regions, and it can utilize a linear exposure transform (as described in U.S. Ser. No. 10/008,026, filed Nov. 5, 2001 by Cahill et al.) to align the exposure values of overlapping images. In addition, a radial exposure transform (as described in U.S. Ser. No. 10/023,137, filed Dec. 17, 2001 by Cahill et al.) can be used in the blending stage to compensate for light falloff.
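By way of illustration only, the phase correlation approach named for the image alignment stage can be sketched as follows. This is a minimal sketch, not the procedure of any cited reference: it assumes two single-channel images of equal size that differ by a purely translational, circular shift, and it recovers only integer offsets. The function name `phase_correlation_shift` is hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def phase_correlation_shift(ref, moved):
    """Estimate the integer (dy, dx) circular shift that maps `ref` onto `moved`.

    Assumes both inputs are 2-D arrays of the same shape.
    """
    f_ref = np.fft.fft2(ref)
    f_mov = np.fft.fft2(moved)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)  # avoid divide-by-zero

    # The inverse transform peaks at the translation between the images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((64, 48))
    shifted = np.roll(reference, shift=(5, -7), axis=(0, 1))
    print(phase_correlation_shift(reference, shifted))
```

In practice the overlap between captured images is only partial, so a real system would typically window the images and refine the peak location; those refinements are omitted here.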
In weakly-constrained systems, the image combining step generally comprises two stages: (1) an image alignment stage, where the images are locally and/or globally aligned according to some model (such as a translational, rotational, affine, or projective model); and (2) a blending stage, where the aligned images are blended together to form a texture map or composite image. The blending stage typically incorporates a de-ghosting technique that locally warps images to minimize “ghost” images, or areas in the overlapping regions where objects are slightly misaligned due to motion parallax. The local warping used by the de-ghosting technique can also be incorporated in the model of the image alignment stage. For an example of image combining with such a system, see the aforementioned Shum and Szeliski references.
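The alignment models listed above (translational, rotational, affine, and projective) can each be expressed as a 3x3 matrix acting on homogeneous pixel coordinates. The following is a minimal sketch under that assumption; the helper names (`translation`, `rotation`, `affine`, `apply_model`) and the numeric values are hypothetical and are not taken from the Shum or Szeliski references.

```python
import math


def translation(tx, ty):
    """Translational model: two parameters."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    """Rotational model about the image origin: one parameter (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def affine(a, b, c, d, tx, ty):
    """Affine model: six parameters; parallel lines stay parallel."""
    return [[a, b, tx], [c, d, ty], [0.0, 0.0, 1.0]]


def apply_model(H, x, y):
    """Map pixel (x, y) through the 3x3 matrix H in homogeneous coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w  # the divide by w is what makes the model projective


# A projective model is any invertible 3x3 matrix (eight free parameters up
# to scale); a non-trivial bottom row introduces perspective foreshortening.
# The values below are purely illustrative.
PROJECTIVE_EXAMPLE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.001, 0.0, 1.0]]

if __name__ == "__main__":
    print(apply_model(translation(3.0, 4.0), 1.0, 2.0))
    print(apply_model(rotation(math.pi / 2), 1.0, 0.0))
    print(apply_model(PROJECTIVE_EXAMPLE, 100.0, 50.0))
```

Each model is a special case of the next, so an alignment stage can begin with the simplest model and move to a more general one only where residual misalignment remains.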
In systems where the capture constraint is removed altogether, the image combining step first requires that the epipolar geometry of the captured images be estimated (for a description of estimating epipolar geometry, see Zhang, et al., “A Robust Technique for Matching Two Uncalibrated Images Through the Recovery of the Unknown Epipolar Geometry,” INRIA Report No. 2273, May 1994, pp. 1–38). Once the epipolar geometry has been estimated, the images are projected to simulate capture onto parallel image planes. The projected images are then morphed by a standard image morphing procedure (see Beier et al., “Feature-Based Image Metamorphosis,” SIGGRAPH '92 Computer Graphics, Vol. 26, No. 2, July 1992, pp. 35–42), and the morphed image is reprojected to a chosen viewpoint to form the composite image. An example of such a system is described in the aforementioned Seitz and Dyer reference.
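To illustrate why projection onto parallel image planes simplifies the subsequent morphing, the sketch below evaluates the epipolar constraint x2^T F x1 = 0 for an idealized pair of views. It is a simplified illustration, not the estimation procedure of Zhang et al. or the method of Seitz and Dyer: it assumes two cameras with identity intrinsics and identical orientation, displaced along the horizontal axis, for which the fundamental matrix reduces (up to scale) to the skew-symmetric matrix of the translation. The names `skew` and `epipolar_residual` are hypothetical.

```python
def skew(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def epipolar_residual(F, p1, p2):
    """Return x2^T F x1 for pixel p1 in the first image and p2 in the second.

    A residual of zero means the pair satisfies the epipolar constraint.
    """
    x1 = (p1[0], p1[1], 1.0)
    x2 = (p2[0], p2[1], 1.0)
    f_x1 = [sum(F[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * f_x1[r] for r in range(3))


# Assumed configuration: parallel image planes, pure translation along x.
F_PARALLEL = skew((1.0, 0.0, 0.0))

if __name__ == "__main__":
    # Same row in both images: the constraint is satisfied.
    print(epipolar_residual(F_PARALLEL, (10.0, 20.0), (35.0, 20.0)))
    # Different rows: the constraint is violated.
    print(epipolar_residual(F_PARALLEL, (10.0, 20.0), (35.0, 23.0)))
```

Under these assumptions the constraint reduces to requiring that corresponding points lie on the same image row, so correspondences in the projected images differ only by a horizontal displacement.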
In all of the prior art methods and systems for generating large field of view images, the composite image is provided as output. In some instances, however, it might be necessary to provide a composite image that has been cropped and/or zoomed to a selected aspect ratio and size. For example, consider a digital photofinishing system that prints hardcopies of images that have been digitized from film after being captured by an Advanced Photo System (APS) camera. APS cameras provide the photographer the choice of receiving prints in three different formats: HDTV (H), Classic (C), or Panoramic (P). The Classic format corresponds to a 3:2 aspect ratio, the HDTV format to a 16:9 aspect ratio, and the Panoramic format to a 3:1 aspect ratio. If the photographer captures a sequence of images with an APS camera and uses one of the known techniques to generate a composite image, the composite image will likely not have an aspect ratio corresponding to the H, C, or P formats. Since one of these three formats would be required in the digital photofinishing system, the photographer must manually intervene and crop the composite image to the appropriate aspect ratio for printing.
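The manual cropping step described above amounts to a small calculation, sketched below for illustration. The sketch assumes a centered crop and selects the APS format whose aspect ratio is numerically closest to that of the composite image; both choices, and the names `APS_FORMATS`, `centered_crop`, and `closest_format`, are assumptions made for this example. A practical system might instead position the crop according to image content.

```python
from fractions import Fraction

# Aspect ratios (width:height) of the three APS print formats.
APS_FORMATS = {
    "C": Fraction(3, 2),   # Classic
    "H": Fraction(16, 9),  # HDTV
    "P": Fraction(3, 1),   # Panoramic
}


def centered_crop(width, height, aspect):
    """Return (left, top, crop_w, crop_h) of the largest centered crop whose
    width-to-height ratio matches `aspect`, rounding down to whole pixels."""
    if Fraction(width, height) > aspect:
        # Composite is too wide: keep the full height and trim the sides.
        crop_h = height
        crop_w = int(height * aspect)
    else:
        # Composite is too tall: keep the full width and trim top and bottom.
        crop_w = width
        crop_h = int(width / aspect)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, crop_w, crop_h


def closest_format(width, height):
    """Return the APS format letter whose aspect ratio is nearest width/height."""
    ratio = Fraction(width, height)
    return min(APS_FORMATS, key=lambda name: abs(APS_FORMATS[name] - ratio))


if __name__ == "__main__":
    composite_w, composite_h = 4000, 1200  # hypothetical composite size
    fmt = closest_format(composite_w, composite_h)
    print(fmt, centered_crop(composite_w, composite_h, APS_FORMATS[fmt]))
```

For the hypothetical 4000 x 1200 composite, the nearest format is Panoramic, and the crop retains the full height while trimming 200 pixels from each side.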
There is a need, therefore, for an improved method of combining images into a composite image, the method being capable of automatically cropping the composite image to a desired aspect ratio.