A. Image Registration
Image registration is the process of aligning a pattern image over a reference image so that pixels present in both images are disposed in the same location. This process is useful, for example, in the alignment of an acquired image over a template, a time series of images of the same scene, or the separate bands of a composite image ("coregistration"). Two practical applications of this process are the alignment of radiology images in medical imaging and the alignment of satellite images for environmental study.
In typical image registration problems, the reference image and the pattern image are known, or are expected, to be related to each other in some way; that is, they have some elements in common, or relate to the same subject or scene. In these problems, the sources of differences between the two images can be segregated into four categories:
1. Differences of alignment: Differences of alignment are caused by a spatial mapping from one image to the other; typical mappings involve translation, rotation, warping, and scaling. For infinite continuous-domain images, such a mapping accounts for the alignment difference completely. Changing the orientation or parameters of an imaging sensor, for example, can cause differences of alignment.
2. Differences due to occlusion: Differences due to occlusion occur when part of a finite image moves out of the image frame, or new data enters the frame, as a result of an alignment difference, or when an obstruction comes between the imaging sensor and the object being imaged. For example, clouds frequently come between a satellite sensor and the earth, causing occlusions in images of the earth's surface.
3. Differences due to noise: Differences due to noise may be caused by sampling error and background noise in the image sensor, or by invalid data introduced by image sensor error that cannot be identified as such.
4. Differences due to change: Differences due to change are actual differences between the objects or scenes being imaged. In satellite images, lighting, erosion, construction, and deforestation are examples of differences due to change. In some cases, it may be impossible to distinguish between differences due to change and differences due to noise.
Images are typically registered in order to detect the changes in a particular scene. Accordingly, successful registration detects and undoes or accounts for differences due to alignment, occlusion, and noise while preserving differences due to change. Registration methods must assume that change is small with respect to the content of the image; that is, the images being registered are assumed to be "visibly similar" after accounting for differences due to alignment, occlusion, and noise. In addition, a sufficient amount of the object or scene must be visible in both images. For example, it may be assumed that at least 50% of the content of the reference image is also present in the pattern image to be registered against it. In practice, medical and satellite sensors can usually be oriented with enough precision for images to share 90% or more of their content.
B. Rotation-Scale-Translation Transformations
The present invention provides an efficient method for registering two images that differ from each other by a Rotation-Scale-Translation ("RST") transformation, in the presence of noise and of the occlusion that the alignment difference itself introduces. The RST transformation is expressed as a combination of three transformation parameters: a single translation vector, a single rotation angle, and a single scale factor, all operating in the plane of the image. The invention recovers these three parameters from the two images, thereby permitting the pattern image to be registered with the reference image by "undoing" the rotation, scale, and translation of the pattern image with respect to the reference image. The invention also encompasses novel methods for recovering the rotation or scale factors alone, which may be useful where it is known that the alignment between the reference and pattern images is affected by only one of these factors.
The RST transformation is expressed as a pixel-mapping function, M, that maps a reference image r into a pattern image p. In practice, these functions operate on finite images and can only account for data that does not leave or enter the frame of the image during transformation. If a two-dimensional infinite continuous reference image r and pattern image p are related by an RST transformation such that p = M(r), then each point r(x_r, y_r) in r maps to a corresponding point p(x_p, y_p) according to the matrix equation:

$$\begin{bmatrix} x_p \\ y_p \end{bmatrix} = \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} + s \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \begin{bmatrix} x_r \\ y_r \end{bmatrix} \qquad (1)$$
Equivalently, for any point (x, y), it is true that:

$$r(x, y) = p\bigl(\Delta x + s(x\cos\phi - y\sin\phi),\ \Delta y + s(x\sin\phi + y\cos\phi)\bigr) \qquad (2)$$
In this notation, φ, s, and (Δx, Δy) are the rotation, scale, and translation parameters, respectively, of the transformation, where φ is the angle of rotation in a counterclockwise direction, s is the scale factor, and (Δx, Δy) is the translation. For finite discrete r and p, assume r and p are square with pixel area N (size √N × √N). Note that an RST transformation of a finite image introduces differences due to occlusion as some data moves into or out of the image frame.
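As a concrete illustration, the forward RST coordinate mapping described above can be sketched in a few lines. This is a minimal sketch assuming NumPy; the function name `rst_map` and its argument names are illustrative, not part of the specification.

```python
# A minimal sketch of the forward RST coordinate mapping, assuming NumPy;
# rst_map and its argument names are illustrative, not from the specification.
import numpy as np

def rst_map(points, phi, s, dx, dy):
    """Map reference coordinates (x_r, y_r) to pattern coordinates (x_p, y_p).

    phi      : counterclockwise rotation angle, in radians
    s        : isotropic scale factor
    (dx, dy) : translation vector
    """
    rot = np.array([[np.cos(phi), -np.sin(phi)],
                    [np.sin(phi),  np.cos(phi)]])
    pts = np.atleast_2d(np.asarray(points, dtype=float))  # shape (n, 2)
    return s * pts @ rot.T + np.array([dx, dy])

# Pure translation: the point (1, 2) moves to (2, 1).
print(rst_map([(1.0, 2.0)], phi=0.0, s=1.0, dx=1.0, dy=-1.0))
```

Registering the pattern against the reference then amounts to applying the inverse of this mapping once φ, s, and (Δx, Δy) have been recovered.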
C. The Fourier-Mellin Invariant
The Fourier transform has certain properties under RST transformations that make it useful for registration problems. Let two two-dimensional infinite continuous images r and p obey the relationship given in Equation (2) above. By the Fourier shift, scale, and rotation theorems, the relationship between F_r and F_p, the Fourier transforms of r and p, respectively, is given by:

$$F_r(\omega_x, \omega_y) = e^{j 2\pi (\omega_x \Delta x + \omega_y \Delta y)/s}\, s^{-2}\, F_p\!\left(\frac{\omega_x \cos\phi + \omega_y \sin\phi}{s},\ \frac{-\omega_x \sin\phi + \omega_y \cos\phi}{s}\right) \qquad (3)$$
Note that the complex magnitude of the Fourier transform F_p is s² times the magnitude of F_r, and it is independent of Δx and Δy. Also, the magnitude of F_p is derived from the magnitude of F_r by rotating F_r by −φ and shrinking its extent by a factor of s. This enables us to recover the parameters of rotation and scale through separate operations on the magnitude of F_p.
Equation (3) shows that rotating an image in the pixel domain by angle φ is equivalent to rotating the magnitude of its Fourier transform by φ. Expanding an image in the pixel domain by a scale factor of s is equivalent to shrinking the extent of the magnitude of its Fourier transform by s and multiplying the height (amplitude) of the magnitude of the Fourier transform by s². Translation in the pixel domain has no effect on the magnitude of the Fourier transform. Because of this invariance, the magnitude of a Fourier transform is referred to as the "Fourier-Mellin invariant," and the Fourier-magnitude space is referred to as the "Fourier-Mellin domain." The Fourier-Mellin transforms, R and P, of r and p, respectively, are R = |F_r| and P = |F_p|.
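The translation invariance of the Fourier magnitude is easy to verify numerically. The sketch below assumes NumPy and uses a cyclic shift (np.roll) so that no occlusion error enters the comparison:

```python
# Numerical check that translation leaves the Fourier magnitude unchanged.
# np.roll performs a cyclic shift, so this toy example avoids occlusion error.
import numpy as np

rng = np.random.default_rng(0)
r = rng.random((64, 64))                      # reference image
p = np.roll(r, shift=(5, -3), axis=(0, 1))    # cyclically translated copy

R = np.abs(np.fft.fft2(r))                    # Fourier-Mellin transform of r
P = np.abs(np.fft.fft2(p))                    # Fourier-Mellin transform of p

print(np.allclose(R, P))                      # the two magnitudes agree
```

The pixel arrays r and p differ, yet their Fourier-Mellin transforms R and P match to floating-point precision, which is the invariance the registration method exploits.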
Many prior art registration techniques operate on the translation-invariant Fourier-Mellin space, then convert to polar-logarithmic ("polar-log") coordinates so that rotation and scale effects appear as translational shifts along orthogonal θ and log_B ρ axes, where B is a global constant logarithm base. See B. Reddy et al., "An FFT-Based Technique For Translation, Rotation, And Scale Invariant Image Registration," IEEE Transactions on Image Processing, Vol. 5, No. 8, pp. 1266-1271 (August 1996); D. Lee et al., "Analysis Of Sequential Complex Images, Using Feature Extraction And Two-Dimensional Cepstrum Techniques," Journal of the Optical Society of America, Vol. 6, No. 6, pp. 863-870 (June 1989); E. DeCastro et al., "Registration Of Translated And Rotated Images Using Finite Fourier Transforms," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-9, No. 5, pp. 700-703 (1987); S. Alliney, "Digital Analysis of Rotated Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 5, pp. 499-504 (May 1993); Q.-S. Chen et al., "Symmetric Phase-Only Matched Filtering Of Fourier-Mellin Transforms For Image Registration And Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, No. 12, pp. 1156-1168 (December 1994). In polar-log space, the normalized correlation coefficient of R and P as a function of shift along these axes is maximized at the coordinate (−φ, −log_B s). The one-dimensional normalized correlation coefficient at shift j is given by:

$$C(j) = \frac{\sum_{i=0}^{N-1} \bigl(R_i - \bar{R}\bigr)\bigl(P_{i+j} - \bar{P}\bigr)}{\sqrt{\sum_{i=0}^{N-1} \bigl(R_i - \bar{R}\bigr)^2 \; \sum_{i=0}^{N-1} \bigl(P_{i+j} - \bar{P}\bigr)^2}} \qquad (4)$$

where the index i + j is taken cyclically (modulo N) and R̄ and P̄ are the means of R and P.
This extends simply to two-dimensional polar-log space.
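A direct evaluation of the one-dimensional normalized correlation coefficient over all cyclic shifts can be sketched as follows. This is a brute-force O(N²) sketch assuming NumPy; the function name is illustrative:

```python
# Normalized correlation coefficient over all cyclic shifts j (direct form).
import numpy as np

def normalized_correlation(r, p):
    """Return C(j) for every cyclic shift j of sequence p against r."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    n = len(r)
    rc = r - r.mean()
    out = np.empty(n)
    for j in range(n):
        pc = np.roll(p, -j) - p.mean()   # the P_{i+j} terms, mean-removed
        out[j] = (rc @ pc) / np.sqrt((rc @ rc) * (pc @ pc))
    return out

# A cyclically shifted copy correlates perfectly at the true shift.
r = np.array([0.0, 1.0, 3.0, 2.0, 0.0, 1.0])
c = normalized_correlation(r, np.roll(r, 2))
print(int(np.argmax(c)), round(float(c.max()), 6))   # 2 1.0
```

The peak location reveals the shift, and its height, bounded by 1, measures how well the sequences agree after that shift is removed.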
Equation (3) holds for infinite images but not for finite images. If it were true for finite images, it would cost O(N log N) operations to obtain the Fourier-Mellin polar-log coefficients, and O(N log N) operations to calculate the normalized correlation coefficient (by the Convolution theorem) for all cyclic shifts of the coefficients. Rotation and scale could thus be detected in O(N log N) time. Using discrete images instead of continuous ones causes some sampling error between the two images and in the calculation of the polar-log representation.
In practice, using high-resolution images and inter-pixel interpolation can minimize these errors. Unfortunately, the theory does not hold for finite images for two reasons:
1. Occlusion error: Rotating, scaling, or translating a finite image causes some of the pixel data to move out of the image frame or some new pixel data to enter the frame.
2. Tiling error: The FFT of a finite image is taken by tiling the image infinitely in the image plane. Rotation and scale do not commute with tiling.
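The O(N log N) evaluation of a correlation at all cyclic shifts mentioned above rests on the Convolution theorem. The sketch below, assuming NumPy, illustrates that general technique (not the invention's specific procedure) by checking the FFT result against a brute-force sum:

```python
# Cyclic cross-correlation at every shift in O(N log N) via the FFT,
# checked against a brute-force O(N^2) sum (Convolution theorem).
import numpy as np

def cyclic_correlation(r, p):
    """Return c[j] = sum_i r[i] * p[(i + j) mod N] for all shifts j."""
    return np.fft.ifft(np.conj(np.fft.fft(r)) * np.fft.fft(p)).real

r = np.array([0.0, 1.0, 2.0, 1.0])
p = np.roll(r, 1)                     # p is r cyclically shifted by 1
fast = cyclic_correlation(r, p)
slow = [sum(r[i] * p[(i + j) % 4] for i in range(4)) for j in range(4)]
print(np.allclose(fast, slow), int(np.argmax(fast)))   # True 1
```

Normalizing this raw correlation (subtracting means and dividing by the standard deviations) yields the normalized coefficient at all shifts at the same asymptotic cost.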
If an image depicts a feature against a uniform and sufficiently large background, as in FIG. 1, row A, only uniform background pixels enter and leave the image frame during transformation, so no data are lost. This is the case for some medical imaging tasks such as MRI, where the images under examination depict cross-sections of anatomy with a uniform background outside the anatomy. For images with nonuniform backgrounds or insufficient padding, such as FIG. 2, row A, transformations introduce occlusion error, and the correlation peak may shift to a different location or suffer significant degradation.
As noted by H. Stone et al. in "A Note On Translation, Rotation, And Scale-Invariant Image Registration," NEC Research Institute Technical Report, No. 97-115R (1997) for rotation and scale, and as also noted by S. Alliney et al. in "Digital Image Registration Using Projections," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, No. 2, pp. 222-233 (March 1986) for translation, the tiling error is unavoidable when taking the FFT of a tiled image, except for rotations that are an integer multiple of 90 degrees and for small translations of padded images. The Fourier transform of a discrete finite image contains a border between tiles that manifests itself in Fourier-Mellin space as a high-intensity "+" shape. (See FIG. 2, row B.) This artifact has greater magnitude than the coefficients contributed by the remainder of the image content. Certain prior art registration methods utilize a rotationally symmetric image frame to avoid seeing this artifact in the Fourier-Mellin space. See E. DeCastro et al., supra. The present invention offers a more effective approach, which is described below and confirmed by the results of an experiment comparing both methods against the traditional approach of using an unprocessed square image.
Despite all of the sources of error, the infinite and finite cases are related closely enough for Fourier-Mellin techniques to work successfully on finite images. However, techniques reported in the literature have low peak correlations and low signal-to-noise ratio in the correlation function. See B. Reddy et al., supra; L. Brown, "A Survey Of Image Registration Techniques," ACM Computing Surveys, Vol. 24, No. 4, pp. 325-376 (1992).
In contrast, the present invention provides a method that achieves near unity peak correlations and a high signal-to-noise ratio in the correlation function, which together greatly improve the accuracy of registration.