Calculating small regions that correspond to each other between a plurality of images is a significant issue for various image processing applications such as object recognition, 3D information reconstruction, and image searching. An image recognition means configured to extract local regions in images in a normalized state invariant to affine transformation and rotation transformation (hereinafter referred to as affine-invariant regions) and to use the correspondence relationship between the affine-invariant regions has the advantage that a change of viewpoint relative to a recognition object can be modeled geometrically. Since it uses local affine-invariant regions, it also has the advantage of high robustness against partial occlusion of the recognition object.
[Non-patent Document 1] W. M. Wells, P. Viola, H. Atsumi, S. Nakajima, and R. Kikinis, “Multi-Modal Volume Registration by Maximization of Mutual Information” Medical Image Analysis, 1996
[Non-patent Document 2] D. G. Lowe, “Distinctive image features from scale-invariant keypoints” Int. J. Comput. Vision, 60(2): 91-110, 2004
[Non-patent Document 3] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust Wide Baseline Stereo from Maximally Stable Extremal Regions” BMVC02, 2002
These techniques are generally implemented by the following three-step processing (cf. FIG. 6): (1) extracting affine-invariant regions from one or more model images and a search object image (sample image); (2) calculating correspondences between the extracted affine-invariant regions on the basis of local information; and (3) verifying the correspondences calculated in step (2), using global information.
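The three-step processing above can be illustrated with a minimal sketch. It is not taken from the cited documents. It assumes that step (1) has already produced regions, each reduced to a center position and a descriptor vector. The names `Region`, `match_local`, and `verify_global` are hypothetical, and the global model in step (3) is simplified to a single shared translation instead of a full affine or epipolar constraint.

```python
from dataclasses import dataclass
from math import dist
from typing import Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for one extracted affine-invariant region."""
    center: Tuple[float, float]      # position of the region in its image
    descriptor: Tuple[float, ...]    # local appearance after normalization


def match_local(model, sample, max_dist=0.5):
    """Step (2): pair each model region with its nearest sample descriptor."""
    if not sample:
        return []
    pairs = []
    for i, m in enumerate(model):
        # Nearest neighbor in descriptor space uses local information only.
        j = min(range(len(sample)),
                key=lambda k: dist(m.descriptor, sample[k].descriptor))
        if dist(m.descriptor, sample[j].descriptor) <= max_dist:
            pairs.append((i, j))
    return pairs


def verify_global(model, sample, pairs, tol=2.0):
    """Step (3): keep the largest set of pairs that share one translation.

    Assumption: a pure translation serves as the global model here.
    """
    def shift(pair):
        i, j = pair
        (mx, my), (sx, sy) = model[i].center, sample[j].center
        return (sx - mx, sy - my)

    best = []
    for p in pairs:
        # Collect every pair whose displacement agrees with that of p.
        agree = [q for q in pairs if dist(shift(p), shift(q)) <= tol]
        if len(agree) > len(best):
            best = agree
    return best


# Toy data standing in for the output of step (1).
model = [Region((10, 10), (1.0, 0.0)),
         Region((20, 10), (0.0, 1.0)),
         Region((30, 30), (0.5, 0.5))]
sample = [Region((15, 12), (1.0, 0.1)),
          Region((25, 12), (0.1, 1.0)),
          Region((90, 5), (0.5, 0.6))]

pairs = match_local(model, sample)
verified = verify_global(model, sample, pairs)
```

In this toy data all three model regions find a local match, but the third pair implies a displacement that disagrees with the other two, so only the first two pairs pass the global check. This shows why step (3) is needed in addition to step (2).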