Automated feature-based image matching is a useful tool in many computer-implemented object and scene recognition applications, from robotic vision to facial recognition, among many others. A number of feature-based image matching algorithms have been developed over the past 20 years. Many of these algorithms, such as the scale-invariant feature transform (SIFT) algorithm, deal well with rotation and scaling between a reference image and a query image. However, most of these algorithms are not robust enough to deal with full affine movement, for example when there is large movement of the camera and/or of objects in the scene. Others that attempt to handle full affine movement are computationally very expensive and are, therefore, not practical for commercial and other real-world applications.
Several algorithms have been proposed that try to achieve robustness to full affine movement by normalizing local patches, or regions, that have undergone an unknown affine distortion. Normalization transforms each of these regions into a standard form in which the effect of the affine transform has been eliminated. The best examples of such algorithms are the Harris-Affine and Hessian-Affine region detectors and the “maximally stable extremal region” (MSER) algorithm. MSER, in particular, has been demonstrated to often perform better than other affine-invariant detectors (when a strong change of scale is present, however, SIFT still exhibits better performance than most other methods). It is important to note that none of these normalization algorithms is truly affine-invariant, because each starts with initial feature scales and locations that are selected in a non-affine-invariant manner. In other words, even though these algorithms claim robustness or invariance to the affine model, their feature detection step is invariant only to the scale-plus-rotation model.
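The normalization idea can be illustrated with a minimal sketch. It assumes a region is represented simply as a set of 2-D point coordinates and uses whitening of the region's second-moment (covariance) matrix as the standard form; this is one common way to normalize and is not necessarily the exact procedure used by any of the detectors named above. The function name `normalize_region` is hypothetical.

```python
import numpy as np

def normalize_region(points):
    """Map an (N, 2) array of region coordinates to a standard form.

    The region is centered and then whitened so that its covariance
    becomes the identity. Two regions related by an affine map end up,
    after this step, related only by an orthogonal transform (a rotation,
    possibly combined with a reflection).
    """
    centered = points - points.mean(axis=0)
    cov = np.cov(centered.T)                      # 2x2 second-moment matrix
    evals, evecs = np.linalg.eigh(cov)            # cov is symmetric
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return centered @ inv_sqrt.T

# Illustration: the same region before and after an arbitrary affine map.
rng = np.random.default_rng(0)
region = rng.normal(size=(200, 2))
A = np.array([[2.0, 0.7], [-0.3, 0.5]])           # arbitrary affine matrix
shift = np.array([5.0, -3.0])                     # arbitrary translation
n_ref = normalize_region(region)
n_query = normalize_region(region @ A.T + shift)

# Residual linear map between the two normalized regions.
residual, *_ = np.linalg.lstsq(n_ref, n_query, rcond=None)
print(np.allclose(residual.T @ residual, np.eye(2)))
```

The residual map is orthogonal, so only a rotation-type ambiguity remains after normalization, and a rotation-invariant descriptor can absorb it. The sketch takes the region itself as given, which is exactly the step the paragraph above identifies as not affine-invariant in these detectors.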
A very recent effort, the “affine SIFT” (ASIFT) algorithm, tries to achieve true affine invariance by searching the full affine space on lower-resolution versions of the images. The best estimates of the affine movement are then tested on the full-resolution images. In theory this algorithm is affine-invariant, but in practice it works by running the SIFT algorithm multiple times, which makes it considerably slower than a single SIFT pass and limits its practical applicability.
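To give a rough sense of why repeated SIFT runs are costly, the following sketch enumerates the simulated views that such a search might visit. It assumes the sampling pattern commonly described for ASIFT (camera tilts in a geometric series with ratio √2, and in-plane rotation angles stepped by 72°/tilt over [0°, 180°)); the function name, parameter names, and default values are illustrative assumptions and are not taken from the text above.

```python
import math

def simulated_views(max_tilt_exp=5, base_angle_deg=72.0):
    """List (tilt, rotation_deg) pairs for an ASIFT-style search.

    Assumed sampling: tilt = 2 ** (k / 2) for k = 0..max_tilt_exp, and
    rotation angles spaced base_angle_deg / tilt apart within [0, 180).
    """
    views = [(1.0, 0.0)]                          # the untilted, original view
    for k in range(1, max_tilt_exp + 1):
        tilt = 2.0 ** (k / 2)
        step = base_angle_deg / tilt
        n_angles = math.ceil(180.0 / step)        # angles strictly below 180
        views.extend((tilt, i * step) for i in range(n_angles))
    return views

views = simulated_views()
# Each view needs its own feature detection and description pass, so the
# number of views is a crude upper bound on the number of SIFT runs per image.
print(len(views), "simulated views per image")
```

In practice the larger tilts operate on subsampled images, so the added cost grows more slowly than the raw view count suggests, but every view still requires its own detection and description pass.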