1. Field of the Invention
The present invention relates to an image matching apparatus, image matching method, and computer program.
2. Description of the Related Art
An image recognition processing method has been proposed, which executes pattern recognition of an object by extracting feature amounts from an image containing the object. This image recognition processing method is usable, for example, for person recognition based on an acquired face image. The person recognition based on a face image has received attention as a technique which allows non-contact recognition, unlike fingerprint recognition or vein recognition, and places no restrictions on the target's action.
Approaches to person recognition based on a face image are roughly classified into two schemes. The first approach is a “pattern matching method” which captures a face as an image pattern expressed by two-dimensionally arraying the grayscale values of pixels and executes matching of the pattern. The second approach is a “feature-based method” which recognizes a person by extracting feature points representing features such as eyes, mouth, and nose in a face and executing matching of feature vectors that express the shapes of the features and their spatial layout relationship as numerical values.
The contents of techniques representative of the approaches will be briefly described below.
A representative example of the pattern matching method is an eigenface method using principal component analysis (PCA) (U.S. Pat. No. 5,164,992). The fundamental outline of the eigenface method will be explained below. In the eigenface method, PCA is applied to the grayscale value patterns of a number of face images, thereby obtaining an orthonormal basis whose vectors are called eigenfaces.
KL (Karhunen-Loeve) expansion is executed for the grayscale pattern of an arbitrary face image by using the orthonormal basis, thereby obtaining a dimensionally compressed vector of the pattern. This vector is defined as a feature vector for recognition. Recognition is done by statistical processing between the feature vectors of an input pattern and those of a registration pattern registered in advance. The fundamental outline of the eigenface method has been described above.
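The outline above can be illustrated by the following minimal sketch, which is not taken from the cited patent. It assumes that the face images are already flattened into rows of a NumPy array, that the basis is obtained by singular value decomposition of the mean-centered data, and that the "statistical processing" is a plain Euclidean nearest-neighbor search. The function names fit_eigenfaces, to_feature, and nearest_registered and the synthetic data are hypothetical.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    """faces: (n_samples, n_pixels) array of flattened grayscale face images."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are orthonormal principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def to_feature(face, mean_face, eigenfaces):
    """Project one flattened face onto the basis (its KL expansion coefficients)."""
    return eigenfaces @ (face - mean_face)

def nearest_registered(feature, registered_features):
    """Index of the registered feature vector closest in Euclidean distance
    (an assumed stand-in for the statistical processing)."""
    distances = np.linalg.norm(registered_features - feature, axis=1)
    return int(np.argmin(distances))

# Synthetic stand-in data: 20 "faces" of 64 pixels each.
rng = np.random.default_rng(0)
faces = rng.random((20, 64))
mean_face, eigenfaces = fit_eigenfaces(faces, n_components=5)
registered = (faces - mean_face) @ eigenfaces.T
# A slightly perturbed copy of registered face 7 serves as the input pattern.
query = faces[7] + 0.001 * rng.standard_normal(64)
match = nearest_registered(to_feature(query, mean_face, eigenfaces), registered)
```

In this sketch each 64-pixel pattern is compressed to a 5-dimensional feature vector, and matching is performed only in that compressed space.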
A representative example of the second approach, the feature-based method, is a technique based on a dynamic link architecture (U.S. Pat. No. 6,356,659). A fundamental outline of this technique will be described below. This technique applies a Gabor filter to extract the periodicity and directivity of each grayscale feature at a number of sampling points (e.g., the contours of eyes, mouth, nose, and face) set on a face pattern and uses the resulting local texture information as feature vectors.
In addition, a graph which associates each sampling point with a node is obtained. The graph is formed by using the spatial layout information of the sampling points and feature vectors serving as the attribute values of the nodes corresponding to the sampling points. A recognition process is performed by dynamically deforming the spatial layout information between the nodes and selecting a registered pattern having the highest similarity between the graph of an input pattern and the graph of registered patterns which are registered in advance.
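The feature extraction step at each sampling point can be sketched as follows. This is only an illustration under assumptions: the kernel is a standard complex Gabor function built directly with NumPy, the feature vector at a node is taken to be the magnitudes of the filter responses, and the dynamic deformation of the graph is omitted. The names gabor_kernel and gabor_features and all parameter values are hypothetical and do not come from the cited patent.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times an oriented plane wave."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    along_wave = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * along_wave / wavelength)
    return envelope * carrier

def gabor_features(image, point, kernels):
    """Response magnitudes of every kernel at one sampling point (row, col).
    Assumes the point is at least half a kernel away from the image border."""
    row, col = point
    half = kernels[0].shape[0] // 2
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    return np.array([np.abs(np.sum(patch * k)) for k in kernels])

# Synthetic texture whose intensity varies only along x with a period of 8 pixels.
image = np.ones((64, 1)) * np.cos(2.0 * np.pi * np.arange(64) / 8.0)[None, :]
kernels = [gabor_kernel(21, 8.0, theta, 4.0) for theta in (0.0, np.pi / 2.0)]
features = gabor_features(image, (32, 32), kernels)
```

Here the kernel oriented along x responds far more strongly than the kernel oriented along y, which is the sense in which the filter output encodes the periodicity and directivity of the local texture.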
The fundamental outline of the technique based on dynamic graph matching has been described above. In addition to techniques based on these approaches, various derivative face recognition techniques have been developed.
The accuracy of face recognition based on a face image, as one of the image matching technologies, depends largely on geometric variations caused by differences in the posture or image sensing direction of a target object and on optical variations caused by differences in illumination conditions. For the purpose of eliminating the influence of optical variations, an image matching apparatus, image matching method, and image matching program as described in Japanese Patent Laid-Open Nos. 2004-145576 and 2000-30065 have been proposed.
Japanese Patent Laid-Open No. 2004-145576 discloses a technique which acquires knowledge to remove illumination variations from a plurality of arbitrary images by learning in advance. On the basis of the knowledge about the illumination variations obtained by learning, feature amounts without the influence of illumination variations are extracted from an input image and a registered image and compared with each other. The learning process for obtaining knowledge regarding illumination variations is executed by the steps of reducing the resolution of the training image or extracting a low frequency component from the training image and constructing a subspace by generating an illumination feature vector from the low resolution image or low frequency image.
On the other hand, Japanese Patent Laid-Open No. 2000-30065 discloses, as a face recognition method robust to illumination variations, a constrained mutual subspace method which expands a mutual subspace method having high tolerance to deformations of a face pattern. The constrained mutual subspace method generates an input subspace based on a plurality of face patterns obtained from a moving image sequence and identifies, as a similarity, the angle made by the generated input subspace and registrant dictionary subspace generated in advance from a moving image sequence of the registrant. The constrained mutual subspace method projects the input subspace and registrant subspace in the mutual subspace method on a subspace without illumination variations (Kazuhiro Fukui et al, “Face Image Recognition Which Uses Mutual Constrained Subspace Method and is Robust to Environmental Variations”, IEICE Transactions D-II Vol. J82-DII, No. 4, pp. 613-620) and identifies, as a similarity, the angle made by the input subspace and registrant subspace projected on the subspace.
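The similarity computation described above can be sketched as follows. The sketch rests on several assumptions that are not stated in the cited documents: each subspace is represented by a matrix with orthonormal columns, the similarity is expressed as the squared cosine of the smallest canonical angle between the two subspaces, and the projected bases are re-orthonormalized by QR decomposition. The function names and the parameter constraint_basis (an orthonormal basis of the subspace without illumination variations, assumed to be given) are hypothetical.

```python
import numpy as np

def subspace_basis(patterns, dim):
    """Orthonormal basis (n_pixels, dim) spanning the main directions of
    patterns, an (n_samples, n_pixels) array. No mean subtraction is applied."""
    _, _, vt = np.linalg.svd(patterns, full_matrices=False)
    return vt[:dim].T

def mutual_subspace_similarity(basis_a, basis_b):
    """Squared cosine of the smallest canonical angle between two subspaces.
    The singular values of A^T B are the cosines of the canonical angles."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.clip(cosines[0], 0.0, 1.0) ** 2)

def constrained_similarity(basis_a, basis_b, constraint_basis):
    """Project both subspaces onto the constraint subspace, then compare.
    Assumes the projected bases keep full column rank."""
    def project(basis):
        projected = constraint_basis @ (constraint_basis.T @ basis)
        q, _ = np.linalg.qr(projected)  # re-orthonormalize after projection
        return q
    return mutual_subspace_similarity(project(basis_a), project(basis_b))

# Toy example in a 6-dimensional pattern space.
identity = np.eye(6)
input_subspace = identity[:, :2]        # spanned by e0, e1
registrant_subspace = identity[:, 2:4]  # spanned by e2, e3
constraint = identity[:, :4]            # assumed illumination-free subspace
same = constrained_similarity(input_subspace, input_subspace, constraint)
different = constrained_similarity(input_subspace, registrant_subspace, constraint)
```

With this convention a similarity of 1 means that the two subspaces share at least one direction, and a similarity of 0 means that they are orthogonal.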
The subspace without illumination variations is generated in the following way. Moving image sequences of a plurality of persons are sensed under the same illumination condition, and a subspace is generated from the moving image of each person. For every pair of persons, a difference subspace between the two persons' subspaces is generated, and the subspace without illumination variations is generated from these difference subspaces.
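One possible reading of this procedure is sketched below. It is an assumption-laden illustration rather than the method of the cited documents: the difference subspace of a pair is taken to be spanned by the normalized differences of the paired canonical vectors of the two subspaces, and the final subspace is taken to be spanned by the leading eigenvectors of the sum of the projection matrices of all difference subspaces. The function names, the tolerance tol, and the choice of which eigenvectors to keep are hypothetical.

```python
import numpy as np

def difference_subspace(basis_a, basis_b, tol=1e-6):
    """Orthonormal basis of the difference subspace of two subspaces.
    basis_a, basis_b: (n_pixels, dim) with orthonormal columns and equal dim."""
    u, cosines, vt = np.linalg.svd(basis_a.T @ basis_b)
    canonical_a = basis_a @ u      # canonical vectors lying in subspace A
    canonical_b = basis_b @ vt.T   # paired canonical vectors lying in subspace B
    # Directions shared by both subspaces (cosine of 1) carry no difference.
    keep = cosines < 1.0 - tol
    diffs = canonical_a[:, keep] - canonical_b[:, keep]
    return diffs / np.linalg.norm(diffs, axis=0)

def constraint_subspace(difference_bases, dim):
    """Leading eigenvectors of the summed projection matrices (assumed choice)."""
    n_pixels = difference_bases[0].shape[0]
    total = np.zeros((n_pixels, n_pixels))
    for basis in difference_bases:
        total += basis @ basis.T
    _, eigvecs = np.linalg.eigh(total)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]

# Toy pair of 2-dimensional subspaces in a 4-dimensional space sharing e0.
e = np.eye(4)
person_a = e[:, :2]
person_b = np.column_stack([e[:, 0], (e[:, 1] + e[:, 2]) / np.sqrt(2.0)])
diff_ab = difference_subspace(person_a, person_b)
constraint = constraint_subspace([diff_ab], dim=1)
```

In the toy example the two subspaces share the direction e0, so the difference subspace is one-dimensional and has no component along e0.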
As described above, it is possible to acquire knowledge to remove the influence of illumination variations from a plurality of image data by learning and remove the illumination variations from an input image based on the knowledge, thereby providing face recognition robust to illumination variations.
However, ensuring high robustness to illumination variations through learning requires a sufficient number of training samples, and the number of samples that is sufficient is generally unknown. Furthermore, recognition performance depends on the learning patterns, and the load related to learning is heavy.