In many image processing systems it is desirable to form panoramic images from a plurality of individual images or from a sequence of video frames. To form a panoramic image, the images of the scene must be aligned with one another and merged (stitched) to form a comprehensive panoramic image of a scene with redundant information removed therefrom. A mosaic image is generally a data structure that melds information from a set of still pictures and/or frames of a video sequence (collectively, "images"), which individually observe the same physical scene at a plurality of different time instants, viewpoints, fields of view, resolutions, and the like. The various images are geometrically aligned and colorimetrically matched, then merged together to form a panoramic view of the scene as a single coherent image.
The phrase image processing, as used herein, is intended to encompass the processing of all forms of images including temporally unrelated images as well as images (frames) of a video signal, i.e., a sequence of temporally related images.
Accurate image alignment is the cornerstone of a process that creates mosaics of multiple images. Alignment (also known as registration) of images begins with determining a displacement field that represents the offset between the images and then warping one image to the other to remove or minimize the offset.
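As a minimal sketch of these two steps, assume the displacement field reduces to a single integer translation; practical systems typically estimate richer parametric or per-pixel fields. The following Python code finds the offset by exhaustive search over a small window and then warps one image to the other. The function names `estimate_shift` and `warp`, the search range, and the synthetic test images are illustrative assumptions only.

```python
def estimate_shift(ref, img, max_shift=3):
    """Return (dy, dx) such that img[y + dy][x + dx] best matches ref[y][x]."""
    h, w = len(ref), len(ref[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, n = 0.0, 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        d = ref[y][x] - img[yy][xx]
                        err += d * d
                        n += 1
            # Compare mean squared error over the overlapping region only.
            if n and err / n < best_err:
                best, best_err = (dy, dx), err / n
    return best


def warp(img, shift, fill=None):
    """Resample img into the coordinate frame of the reference image."""
    dy, dx = shift
    h, w = len(img), len(img[0])
    return [
        [img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else fill
         for x in range(w)]
        for y in range(h)
    ]


def scene(y, x):
    """Synthetic scene intensity (hypothetical test pattern)."""
    return 10 * y + x


# Two 6x6 views of the same synthetic scene, offset by one row and two columns.
ref = [[scene(y, x) for x in range(6)] for y in range(6)]
img = [[scene(y - 1, x - 2) for x in range(6)] for y in range(6)]

shift = estimate_shift(ref, img)   # displacement between the two images
aligned = warp(img, shift)         # img warped into ref's coordinates
```

Pixels of `aligned` that fall outside the second image are left as `None`; in a mosaic these are the regions that only other images can fill.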
In order for the mosaic to be coherent, points in the mosaic must be in one-to-one correspondence with points in the scene. Accordingly, given a reference coordinate system on a surface to which source images will be warped and combined, it is necessary to determine the exact spatial mapping between points in the reference coordinate system and pixels of each image.
Methods for manually or automatically producing mosaics from source images are known in the art. One example of an automatic mosaic generation system is disclosed in U.S. Pat. No. 5,649,032 issued Jul. 15, 1997, which is hereby incorporated herein by reference. In this patent, temporally adjacent video frames are registered to each other, yielding a chain of image-to-image mappings which are then recursively composed to infer all the reference-to-image mappings. Alternatively, each new frame is registered to the mosaic which was recursively constructed from previous frames, yielding the desired reference-to-image mappings directly. The '032 patent thus describes techniques that use either frame-to-frame or frame-to-mosaic registration to align the images.
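The recursive composition of frame-to-frame mappings can be sketched as follows. The sketch assumes that each mapping is a 3x3 homogeneous matrix taking coordinates in frame k to coordinates in frame k+1, and that frame 0 defines the reference coordinate system. These conventions, the helper names, and the example translations are assumptions made for illustration; they are not details taken from the '032 patent.

```python
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    """Homogeneous matrix for a pure translation (simplest possible mapping)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def compose_chain(pairwise):
    """Turn frame-to-frame mappings into reference-to-image mappings.

    pairwise[k] maps frame k coordinates to frame k+1 coordinates, so the
    mapping from the reference (frame 0) to frame k+1 is
    pairwise[k] * pairwise[k-1] * ... * pairwise[0].
    """
    ref_to_img = [IDENTITY]
    for step in pairwise:
        ref_to_img.append(matmul3(step, ref_to_img[-1]))
    return ref_to_img


def apply(m, x, y):
    """Map the point (x, y) through the homogeneous matrix m."""
    u = m[0][0] * x + m[0][1] * y + m[0][2]
    v = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (u / w, v / w)


# Hypothetical registrations between four temporally adjacent frames.
pairwise = [translation(-5.0, 0.0), translation(-5.0, 1.0), translation(-4.0, 0.0)]
ref_to_img = compose_chain(pairwise)

# Where a reference point lands in the last frame.
point_in_frame3 = apply(ref_to_img[3], 20.0, 10.0)
```

Because every reference-to-image mapping is a product of all earlier frame-to-frame mappings, an error in any single registration propagates to every later frame in the chain.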
These known methods have several disadvantages. First, if any one of the frame-to-frame registrations cannot be estimated accurately, the chain is broken and subsequent frames cannot be reckoned with respect to the same reference coordinate system. Second, when the camera's field of view overlaps part of the scene which was originally observed a long time ago, these methods do not ensure that the new images will be registered with those old ones. For example, FIG. 1 depicts a time-ordered sequence of images 101 to 108, where the images are formed by panning a camera from left to right (represented by arrow 109) for images 101 to 104 and panning the camera from right to left (represented by arrow 110) for images 105 to 108. The bottom regions of images 101 through 104 overlap the top regions of images 105 through 108. If images in the spatial configuration of FIG. 1 occur in time order starting with image 101 and continuing through image 108, and each image is registered to its predecessor image, then there is no assurance that images 101 and 108 will be in alignment when warped to the mosaic's reference coordinate system. As such, an early image (e.g., image 101) may not align properly with a later-produced image (e.g., image 108) along the overlapping portions of these images. Consequently, a panoramic mosaic produced using techniques of the prior art may be significantly distorted.
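To illustrate why chained registration gives no assurance that images 101 and 108 align, the sketch below assumes a purely translational model and a hypothetical constant error in each frame-to-frame estimate. The image positions and the error value are invented for illustration and are not taken from FIG. 1.

```python
# Hypothetical true positions (x, y) of images 101..108 in mosaic coordinates:
# a left-to-right pan along the top row, then a right-to-left pan one row lower.
true_pos = {
    101: (0, 0), 102: (100, 0), 103: (200, 0), 104: (300, 0),
    105: (300, 80), 106: (200, 80), 107: (100, 80), 108: (0, 80),
}

# Assumed error (in pixels) made by every frame-to-frame registration.
step_error = (0.5, 0.25)

# Place each image by chaining the estimated offset from its predecessor.
order = sorted(true_pos)
est_pos = {order[0]: (0.0, 0.0)}
for prev, cur in zip(order, order[1:]):
    dx = true_pos[cur][0] - true_pos[prev][0] + step_error[0]
    dy = true_pos[cur][1] - true_pos[prev][1] + step_error[1]
    est_pos[cur] = (est_pos[prev][0] + dx, est_pos[prev][1] + dy)


def offset(a, b, pos):
    """Displacement of image b relative to image a under the placement pos."""
    return (pos[b][0] - pos[a][0], pos[b][1] - pos[a][1])


true_rel = offset(101, 108, true_pos)   # how 108 really sits relative to 101
est_rel = offset(101, 108, est_pos)     # what the chain of registrations implies
drift = (est_rel[0] - true_rel[0], est_rel[1] - true_rel[1])

# Error between temporally adjacent images stays at a single step's error.
adjacent = offset(101, 102, est_pos)
adjacent_error = (adjacent[0] - 100, adjacent[1] - 0)
```

Under these assumptions each adjacent pair is misregistered by only one step's error, while the error between images 101 and 108 is the sum over all seven steps, even though the two images overlap directly.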
In prior art methods, not only can frames in the mosaic be misaligned, but the overall structure of the scene may also be incorrectly represented. For example, some scene parts may appear twice, parts that should be adjacent may appear far apart, or parts that should not be adjacent may appear close together. If the images form a large closed loop, the closure might not be represented in the mosaic. These errors occur when the complete topology of neighborhood relationships among images is not fully recognized.
A further limitation of existing mosaic generation techniques is that they estimate spatial mappings suitable only for combining images onto a cylindrical or planar reference surface, which is not a suitable representation for panoramas that subtend angles of more than about 140 degrees in both directions.
Therefore, a need exists in the art for an image processing technique that forms panoramic mosaics by determining the topology of an image sequence and globally aligning the images in accordance with the topology.