A microfiche appendix of "C" source code for a preferred embodiment is filed herewith. A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates generally to joining of fragments of an image to assemble the complete image, and particularly to accurately joining multiple image fragments.
Today, image processing devices allow images to be "captured" by computer systems by, e.g., scanning an image to obtain a digital representation of the image. Also, digital representations of images can be printed to generate a hard copy of the image. Examples of image processing devices are copiers, fax machines and scanners. These systems now use advanced technology to allow a human operator to manipulate the captured image by reducing, enlarging, adjusting the contrast, resolution or color of images, etc. While today's basic image processing devices are well-suited to handling standard size images, such as an image on an 8.5″ × 11″ sheet of paper, problems arise in these devices where an oversize image needs to be broken into image fragments in order to capture the image into a device and the fragments need to be reassembled for printing or other further processing.
For example, a problem with copy machines arises when it is desired to copy an oversize image, such as a map or poster. This is because the configuration of the copy machine will usually allow only portions, or fragments, of the oversize image to be scanned in each pass of the copier's scanning mechanism. This means that the human user of the copier needs to manually position the oversize image and make multiple scans of portions of the map or poster. Because the user must visually align the oversize image on the copier's platen, often without the aid of any registration marks, the user ends up with a hodgepodge collection of non-uniform fragments of the oversize image spread out over the papers. In the worst case, the user must then manually assemble the image fragments by cropping and taping together the pages.
Similarly, fax machines are limited to accepting paper of fixed and relatively small dimensions. If an oversize document is wider than that allowable by the fax machine, the document must be broken up into smaller images on smaller sheets of paper. The oversize image is then transmitted as several pieces to a receiving fax machine. A user at the receiving fax machine then goes through a similar process to piece together the oversize document's image from the multiple fragments of the document.
The process of automatically aligning image fragments to reproduce an original image is known as image registration. Some prior art image registration techniques were primarily developed for applications in the remote sensing field, e.g. constructing a composite satellite image of a large area from multiple photographs taken at different satellite positions.
These techniques, however, cannot be effectively applied to the office copier environment. On the one hand, long response times cannot be tolerated in copier applications. On the other hand, image registration techniques used in remote sensing must not only translate and rotate image fragments relative to one another to align them, but must also correct for nonlinear effects, aspect, scale, changing contrast, and other effects.
Image registration techniques developed for the office environment all suffer from one or more shortcomings. One technique relies on marks that must be specially applied to the original large format document. In accordance with another technique, a large format document is sequentially scanned in segments. Pairs of image fragments are then registered to one another in sequence.
This technique cannot provide professional quality in situations where more than two overlapping fragments are to be aligned. Consider the situation in FIG. 1 where 4 overlapping image fragments 2, 4, 6, and 8 have been aligned with one another in accordance with the prior art technique. Essentially, the prior art technique uses pairwise alignment to align image fragment 2 with image fragment 4, image fragment 4 with image fragment 6, and image fragment 6 with image fragment 8. The alignments of image fragment pairs 2 and 4, 4 and 6, and 6 and 8 are acceptable. However, the alignment between image fragment 2 and image fragment 8 is unacceptable. This is because imperceptible errors in the pairwise alignment accumulate to the point that the alignment error between image fragments 2 and 8 is perceptible. Since many applications will require more than two image fragments to be joined, this type of alignment error propagation over multiple fragments represents a serious shortcoming of the prior art.
In accordance with the present invention, more than two fragments of an image may be aligned to assemble the image while providing high alignment quality between each pair of overlapping image fragments. Image registration operations are performed rapidly. The present invention finds application in, for example, scanning, copying, and facsimile transmission of large format documents.
In accordance with a first aspect of the present invention, apparatus is provided for optimally joining more than two overlapping image fragments of a complete image to recover the complete image. The apparatus includes means for measuring an alignment error among at least two overlapping ones of the more than two image fragments in accordance with a first predetermined metric; means for refining an alignment between two selected overlapping image fragments of the more than two image fragments to reduce an alignment error between the two selected overlapping image fragments; means, coupled to the measuring means, for accumulating a total alignment error between every possible overlapping pair of the more than two image fragments in accordance with the predetermined metric; and global optimization means, coupled to the accumulating means and the refining means, for repeatedly applying the refining means to successive pairs of the overlapping image fragments to optimize the total alignment error.
In one embodiment of the present invention, a large format document or panoramic scene is captured as individual overlapping fragments by a scanner or other image capture device. A user then applies a user interface including a display and a pointing device to approximately align the image fragments on the display. One example of this kind of image fragment manipulation is described in U.S. patent application Ser. No. 08/446,196, (which is now U.S. Pat. No. 5,732,230 issued Mar. 24, 1998) assigned to the assignee of the present application, the contents of which are herein expressly incorporated by reference for all purposes.
Once the image fragments are brought by the user into approximate alignment, automatic image registration takes over. A list of overlapping image fragment pairs is constructed. The alignment of each pair of fragments is refined in turn. Within the scope of the present invention, any technique could be used to align each pair of image fragments. While individual pairs of fragments are being aligned, a total alignment error for all the pairs is monitored. New pairwise alignments that increase total error are rejected. When the total alignment error ceases to improve, the refinement process terminates.
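The refinement loop just described can be sketched as follows. This is only an illustrative sketch, not the code of the microfiche appendix. The names `globally_refine`, `total_error`, `pair_error`, and `refine_pair` are hypothetical, each fragment is reduced to a single (x, y) placement with no rotation, and the toy error and refinement functions merely stand in for whatever pairwise registration technique is actually used. The sketch accepts a new pairwise alignment only when it strictly lowers the total error, a simplification that guarantees termination.

```python
from typing import List, Tuple

Offset = Tuple[int, int]  # hypothetical: a fragment's (x, y) placement
Pair = Tuple[int, int]    # indices of two overlapping fragments


def total_error(positions, pairs, pair_error):
    """Sum the pairwise alignment error over every overlapping pair."""
    return sum(pair_error(positions, a, b) for a, b in pairs)


def globally_refine(positions, pairs, pair_error, refine_pair, max_passes=50):
    """Refine each pair in turn, keeping only changes that lower the total."""
    positions = list(positions)
    best = total_error(positions, pairs, pair_error)
    for _ in range(max_passes):
        improved = False
        for a, b in pairs:
            trial = list(positions)
            trial[b] = refine_pair(positions, a, b)  # proposed new placement
            err = total_error(trial, pairs, pair_error)
            if err < best:  # otherwise the pairwise result is rejected
                positions, best = trial, err
                improved = True
        if not improved:  # total error has ceased to improve
            break
    return positions, best


# Toy stand-ins (assumptions): each pair has a known ideal relative offset.
TARGETS = {(0, 1): (10, 0), (1, 2): (0, 10), (2, 3): (-10, 0), (0, 3): (0, 10)}


def toy_pair_error(positions, a, b):
    """L1 distance between the actual and the ideal relative offset."""
    tx, ty = TARGETS[(a, b)]
    return (abs(positions[b][0] - positions[a][0] - tx)
            + abs(positions[b][1] - positions[a][1] - ty))


def toy_refine_pair(positions, a, b):
    """Place fragment b at its ideal offset from fragment a."""
    tx, ty = TARGETS[(a, b)]
    return (positions[a][0] + tx, positions[a][1] + ty)


start = [(0, 0), (11, 1), (9, 12), (-2, 9)]  # approximate user alignment
aligned, err = globally_refine(start, list(TARGETS),
                               toy_pair_error, toy_refine_pair)
print(aligned, err)
```

Because every proposed pairwise alignment is scored against the sum over all pairs, a change that helps one pair while harming another by a larger amount is never kept.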
This optimization process provided by the present invention thus assures that improvements in the alignment of one pair of image fragments do not come at the expense of the alignment of another pair of image fragments. Thus satisfactory registration quality is assured for all pairs of fragments.
The present invention further provides efficient techniques for refining the alignment of two overlapping image fragments. Generally, such techniques involve searching over a space of possible alignments for a best match. The present invention provides several techniques that may be applied to limit the search space and thus accelerate the refinement process.
One such technique provided by the present invention is a technique for first identifying template areas or interest points in a first image fragment to limit the search space for possible refined alignments. A grid of cells is overlaid over the first image fragment. An interest operator is applied to each pixel in the image to obtain an interest level for each pixel. For each cell having a pixel whose interest level exceeds a predetermined threshold, the pixel with the greatest interest level in the cell is selected as a candidate interest point. Thus each cell has either one or zero candidate interest points. From the candidates, a first interest point is selected to be the candidate with the greatest interest level. A second interest point is selected to be the candidate interest point furthest away from the first interest point. Alternatively, some other number of interest points could be selected from the candidate interest points using similar criteria or other criteria.
The present invention further provides an enhanced technique for finding the interest level of each pixel. In accordance with this enhanced metric, the variance of pixel values is determined among pixels located a radius of r pixels away along a vertical or horizontal axis, and is then calculated for pixels 2r pixels away and 3r pixels away. Three variances are thus obtained, and the mean of these three variances is taken to be the Moravec's variance for the pixel being evaluated. Of course, in accordance with the present invention, the number of variances used in the final determination could be other than three.
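A minimal sketch of this grid-based selection is given below, assuming the interest levels have already been computed and stored as a list of rows. The function name `select_interest_points` and its parameters `cell` (cell size in pixels) and `threshold` are hypothetical, and Euclidean distance is assumed for "furthest away".

```python
import math


def select_interest_points(interest, cell, threshold):
    """Return up to two interest points as (x, y) tuples.

    interest  -- 2D list of per-pixel interest levels (rows of columns)
    cell      -- side length of each square grid cell, in pixels
    threshold -- minimum interest level for a candidate
    """
    # At most one candidate per cell: the pixel with the greatest level.
    best_in_cell = {}
    for y, row in enumerate(interest):
        for x, level in enumerate(row):
            if level <= threshold:
                continue
            key = (y // cell, x // cell)
            if key not in best_in_cell or level > best_in_cell[key][0]:
                best_in_cell[key] = (level, (x, y))
    if not best_in_cell:
        return []

    candidates = list(best_in_cell.values())
    # First point: the candidate with the greatest interest level.
    first = max(candidates, key=lambda c: c[0])[1]
    others = [p for _, p in candidates if p != first]
    if not others:
        return [first]
    # Second point: the candidate furthest from the first point.
    second = max(others,
                 key=lambda p: math.hypot(p[0] - first[0], p[1] - first[1]))
    return [first, second]


levels = [
    [0, 9, 0, 0],
    [0, 0, 0, 6],
    [0, 0, 0, 0],
    [7, 0, 0, 8],
]
points = select_interest_points(levels, cell=2, threshold=5)
print(points)
```

Choosing the second point by distance rather than by interest level gives a long baseline between the two points, which makes any rotation between the fragments easier to recover.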
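One possible reading of this enhanced interest operator is sketched below. Several details are assumptions, since the text does not fix them: the samples at each radius are taken to be the four pixels directly left, right, above, and below the pixel under evaluation, the population variance is used, and pixels too close to the border receive an interest level of zero. The names `interest_level`, `r`, and `rings` are hypothetical.

```python
def population_variance(values):
    """Mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def interest_level(img, x, y, r=1, rings=3):
    """Mean of the variances measured at distances r, 2r, ..., rings * r.

    img is a 2D list of pixel values indexed as img[y][x].
    """
    height, width = len(img), len(img[0])
    reach = r * rings
    # Assumption: pixels whose samples would fall off the image score zero.
    if x < reach or y < reach or x + reach >= width or y + reach >= height:
        return 0.0
    variances = []
    for k in range(1, rings + 1):
        d = k * r
        # Four samples along the horizontal and vertical axes.
        samples = [img[y][x - d], img[y][x + d], img[y - d][x], img[y + d][x]]
        variances.append(population_variance(samples))
    return sum(variances) / rings


flat = [[5] * 7 for _ in range(7)]
spot = [[0] * 7 for _ in range(7)]
spot[3][2] = 4  # one bright pixel just left of the center (3, 3)
print(interest_level(flat, 3, 3), interest_level(spot, 3, 3))
```

In the second image the samples at distance 1 are 4, 0, 0, 0, whose variance is 3, while the samples at distances 2 and 3 are all zero, so the mean of the three variances is 1.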
The present invention provides further techniques limiting the search space of possible refined alignments even after interest points have been identified in the first overlapping image fragment. For each interest point, the mean of a region surrounding each interest point in the first image fragment is evaluated. The region shape is selected so that the region's mean is invariant over rotations relative to the first image fragment and may, for example, be a circle or an annulus. For each interest point, this mean is used to limit the search space of possible translational alignments in the second image fragment. For each pixel in the second image fragment, the mean pixel value of a similarly structured region is evaluated. If the mean pixel value differs from the mean pixel value determined for the region surrounding the interest point by more than a threshold percentage, the translational alignment of the interest point to that pixel in the second image fragment can be discarded as a possible alignment. Because of the region shape, rotational alignments need not be checked separately at this stage. This aspect of the present invention greatly enhances the efficiency of refining the alignment of two overlapping image fragments.
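The pruning step above might look like the following sketch, which uses a filled circle as the rotation-invariant region. The names `disc_offsets`, `disc_mean`, and `prune_candidates` are hypothetical, and the threshold percentage is assumed to be expressed as a fraction of the mean measured around the interest point.

```python
def disc_offsets(radius):
    """Pixel offsets (dx, dy) lying inside a circle of the given radius."""
    return [(dx, dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dx * dx + dy * dy <= radius * radius]


def disc_mean(img, x, y, offsets):
    """Mean pixel value over the circular region centered on (x, y)."""
    return sum(img[y + dy][x + dx] for dx, dy in offsets) / len(offsets)


def prune_candidates(first, point, second, radius, tolerance):
    """Keep pixels of `second` whose region mean is close to that of `point`.

    tolerance is the allowed relative difference, e.g. 0.1 for 10 percent.
    """
    offsets = disc_offsets(radius)
    reference = disc_mean(first, point[0], point[1], offsets)
    height, width = len(second), len(second[0])
    kept = []
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            mean = disc_mean(second, x, y, offsets)
            if abs(mean - reference) <= tolerance * abs(reference):
                kept.append((x, y))
    return kept


def blob_image(width, height, cx, cy):
    """Hypothetical test image: a bright radius-1 disc on a dark background."""
    img = [[0] * width for _ in range(height)]
    for dx, dy in disc_offsets(1):
        img[cy + dy][cx + dx] = 100
    return img


first = blob_image(5, 5, 2, 2)
second = blob_image(7, 7, 4, 3)
survivors = prune_candidates(first, (2, 2), second, radius=1, tolerance=0.1)
print(survivors)
```

Only the surviving pixels need to be passed to the more expensive pixel-by-pixel comparison. Because the region is circular, a rotated copy of the neighborhood yields approximately the same mean, so no rotation loop is needed here.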
In the preferred embodiment, initial evaluation of possible alignments of two overlapping image fragments is done substantially in accordance with the teaching of [Barnea72], the contents of which are herein expressly incorporated by reference for all purposes. Each interest point is aligned separately. The alignment errors of possible alignments are measured in accordance with a predetermined alignment error metric. In the preferred embodiment, the so-called L1 metric is used to determine alignment error. The error is calculated on a pixel-by-pixel basis between a region surrounding the interest point and a similarly sized region in the second image fragment. Once an accumulated error for a given alignment exceeds a threshold, calculations for that alignment may be terminated since there is little or no possibility of the alignment being ultimately selected as the refined alignment. The threshold may take into consideration the errors calculated for previous alignments and how far along the current alignment error evaluation is currently.
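The early-termination idea can be illustrated with the sketch below. It is a simplified stand-in rather than the procedure of [Barnea72] or of the appendix: it handles translation only, compares a square window, and uses a single fixed `limit` instead of a threshold that adapts to earlier results and to progress through the window. The names `l1_error` and `matches_below_limit` are hypothetical.

```python
def l1_error(first, point, second, candidate, half, limit):
    """Accumulate the L1 error over a (2 * half + 1) square window.

    Returns the total error, or None as soon as it exceeds `limit`.
    """
    px, py = point
    cx, cy = candidate
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(first[py + dy][px + dx] - second[cy + dy][cx + dx])
            if total > limit:
                return None  # abandon this alignment early
    return total


def matches_below_limit(first, point, second, candidates, half, limit):
    """List (candidate, error) for candidates whose error stays within limit."""
    matches = []
    for candidate in candidates:
        error = l1_error(first, point, second, candidate, half, limit)
        if error is not None:
            matches.append((candidate, error))
    return matches


# Toy data (assumption): `second` holds `first` shifted by one pixel in x and y.
first = [[x + 10 * y for x in range(5)] for y in range(5)]
second = [[(x - 1) + 10 * (y - 1) if x >= 1 and y >= 1 else 0
           for x in range(6)] for y in range(6)]

matches = matches_below_limit(first, (2, 2), second,
                              [(2, 2), (3, 3)], half=1, limit=20)
print(matches)
```

For the wrong candidate (2, 2) each pixel differs by 11, so the running total passes the limit of 20 after two pixels and the remaining seven comparisons are skipped.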
The result of the search through possible alignments is preferably a set of lists, with a list for each interest point, of alignments for which the measured error falls below a threshold. Each alignment may be represented as the location of the pixel in the second overlapping image fragment that lines up with the interest point.
In accordance with the invention, the refined alignment may be selected by taking into consideration the geometric relationship among the interest points located in the first image fragment. For a good alignment, the pixels in the second image fragment that align to the interest points of the first image fragment must have the same geometric relationship among themselves as the interest points. For the preferred case of two interest points, the Euclidean distance between the interest points will of course correspond to the Euclidean distance between the pixels aligning to them in the second image fragment.
To identify the alignments for which this geometric relationship holds, the lists associated with each interest point are searched for groups of pixels having the same geometric relationship among themselves as the interest points. If only one group of pixels matches this criterion within a predetermined tolerance, that group of pixels is used to determine the new refined alignment. If more than one group of pixels meets this criterion within the predetermined tolerance, the centroids of closely clustered pixels are used as the basis for determining the refined alignment.
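For the preferred case of two interest points, the search of the candidate lists might be sketched as follows. The names `consistent_pairs`, `refined_match`, and `rotation_between` are hypothetical. When several pairs pass the distance test, the sketch assumes that they all belong to one tight cluster and simply takes the centroid of the distinct pixels on each side; a fuller implementation would cluster the pixels first.

```python
import math


def distance(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])


def consistent_pairs(i1, i2, list1, list2, tol):
    """Pairs (p, q) whose separation matches that of the interest points."""
    target = distance(i1, i2)
    return [(p, q) for p in list1 for q in list2
            if abs(distance(p, q) - target) <= tol]


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def refined_match(i1, i2, list1, list2, tol):
    """Pixels in the second fragment that align to interest points i1 and i2."""
    pairs = consistent_pairs(i1, i2, list1, list2, tol)
    if not pairs:
        return None
    if len(pairs) == 1:
        return pairs[0]
    # Assumption: all matching pixels form a single close cluster per point.
    return (centroid(sorted({p for p, _ in pairs})),
            centroid(sorted({q for _, q in pairs})))


def rotation_between(i1, i2, q1, q2):
    """Angle in radians from the segment i1->i2 to the segment q1->q2."""
    return (math.atan2(q2[1] - q1[1], q2[0] - q1[0])
            - math.atan2(i2[1] - i1[1], i2[0] - i1[0]))


i1, i2 = (0, 0), (10, 0)
single = refined_match(i1, i2, [(5, 5), (40, 40)], [(5, 15), (90, 90)], tol=0.1)
cluster = refined_match(i1, i2, [(5, 5), (6, 5)], [(5, 15), (6, 15)], tol=0.1)
angle = rotation_between(i1, i2, single[0], single[1])
print(single, cluster, angle)
```

Once the two aligned pixels are known, the translation follows from the displacement of the first point and the rotation from the change in direction of the segment joining the two points.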