In computer vision applications, image registration establishes a common frame of reference for a set of images of a scene acquired from different viewpoints or with different cameras. Typically, image registration is required for video tracking, medical imaging, remote sensing, super-resolution and data fusion.
In general, image registration methods can be direct or feature-based. Direct methods use pixel-to-pixel matching and minimize a measure of image dissimilarity to determine a parametric transformation between two images. Often, hierarchical approaches are used to improve convergence.
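As an illustration of a direct method, the following minimal sketch searches integer translations and keeps the one that minimizes the mean squared intensity difference over the overlapping pixels. The function names, the list-of-rows image representation, the translation-only motion model, and the `max_shift` search radius are assumptions made for this example. The sketch performs an exhaustive single-level search; a hierarchical approach would typically run a comparable search on downsampled copies of the images first and refine the estimate at finer resolutions.

```python
def mean_squared_difference(ref, mov, dx, dy):
    """Dissimilarity between ref[y][x] and mov[y + dy][x + dx] over their overlap."""
    h, w = len(ref), len(ref[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            ys, xs = y + dy, x + dx
            if 0 <= ys < h and 0 <= xs < w:
                d = ref[y][x] - mov[ys][xs]
                total += d * d
                count += 1
    # No overlap means the shift cannot be evaluated.
    return total / count if count else float("inf")


def register_translation(ref, mov, max_shift=3):
    """Return the integer shift (dx, dy) with the lowest dissimilarity.

    Assumes both images are equally sized lists of rows of numbers.
    """
    best_cost, best_shift = float("inf"), (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = mean_squared_difference(ref, mov, dx, dy)
            if cost < best_cost:
                best_cost, best_shift = cost, (dx, dy)
    return best_shift


def make_test_pair(size=8, dx=1, dy=2):
    """Build a small reference image and a copy displaced by (dx, dy)."""
    ref = [[0] * size for _ in range(size)]
    ref[2][3], ref[3][3], ref[2][4] = 10, 5, 7
    mov = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if 0 <= y - dy < size and 0 <= x - dx < size:
                mov[y][x] = ref[y - dy][x - dx]
    return ref, mov
```

Calling `register_translation(*make_test_pair())` recovers the displacement used to build the pair. Because the search is exhaustive, its cost grows with the square of `max_shift`, which is one reason coarse-to-fine schemes are attractive for direct methods.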
Feature-based methods first extract distinctive features from each image. Then, the features are matched to establish feature correspondences. The images are then warped according to a parametric transformation estimated from the correspondences. Unlike direct methods, feature-based registration does not require initialization and is able to handle large motion and viewpoint changes between the images. However, extracting distinctive features that are invariant to illumination, scale and rotation is difficult.
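The step of estimating a parametric transformation from correspondences can be sketched as follows. This example assumes that the correspondences are already matched and free of outliers, and it uses a similarity transformation (rotation, uniform scale and translation) as the parametric model; both choices, along with the function names, are assumptions for illustration only. Points are treated as complex numbers so that the least-squares fit of w = a·z + b has a closed form.

```python
import cmath


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    src and dst are equal-length lists of (x, y) pairs. Returns complex
    numbers (a, b) such that dst is approximately a * src + b, where
    abs(a) is the scale and cmath.phase(a) is the rotation angle.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    z_mean = sum(zs) / len(zs)
    w_mean = sum(ws) / len(ws)
    # Closed-form solution of the complex linear regression.
    num = sum((w - w_mean) * (z - z_mean).conjugate() for z, w in zip(zs, ws))
    den = sum(abs(z - z_mean) ** 2 for z in zs)
    if den == 0:
        raise ValueError("source points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point through the fitted transform."""
    w = a * complex(*point) + b
    return w.real, w.imag
```

In practice, feature matches contain outliers, so an estimate of this kind is usually wrapped in a robust scheme such as random sample consensus rather than applied to all matches at once.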
The scale-invariant feature transform (SIFT) can be used to register images. SIFT is insensitive to the ordering, orientation, scale and illumination of the images.
Images to be registered can be acquired with different cameras and imaging modalities, e.g., visible and infrared (IR). Due to the different characteristics of multi-modal imaging sensors, the relationship between the intensities of corresponding pixels in multi-modal images can be complex and unknown.
Conventional intensity-based feature extraction fails in the case of multi-modal images. The features that appear in one image might not be present in other images. For example, an IR image of a painting appears to be homogeneous because all the different colored paints have the same IR radiance.
Mutual information can be used for registering multi-modal images. Mutual information relies on the dependence of the distribution of the intensities of matched pixels, instead of the similarities of intensities of corresponding pixels.
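The following minimal sketch computes mutual information from the joint histogram of two aligned images. It assumes that the images are given as flat, equal-length sequences of already quantized intensities; the function name and this representation are assumptions for illustration.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information, in bits, between two aligned intensity sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram of matched pixel pairs
    hist_a = Counter(a)          # marginal histogram of the first image
    hist_b = Counter(b)          # marginal histogram of the second image
    mi = 0.0
    for (u, v), count in joint.items():
        # p(u, v) * log2( p(u, v) / (p(u) * p(v)) )
        mi += (count / n) * math.log2(count * n / (hist_a[u] * hist_b[v]))
    return mi
```

For example, an image and its intensity-inverted copy have no similar intensities at corresponding pixels, yet each intensity in one fully determines the intensity in the other, so their mutual information equals the entropy of the image. A registration procedure built on this measure would search for the transformation that maximizes it.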
Geometric properties, such as contours, edges, and corners, can also be used to register images. Global estimation can be used to improve convergence to the common geometric properties.
Another method registers images by iteratively minimizing orientation displacements of pixels with large intensity gradients. However, most prior art methods assume that the displacement between multi-modal images is small, and that the features are highly correlated.