Meta-data is generally data that describes or represents the meaning of other data. In the case of face recognition, it mainly refers to data concerning the face data of static images, dynamic images, and the like.
As a standardization activity for meta-data on multimedia content such as pictures, images, video, and voice, MPEG-7 (an international standard for a multimedia content description interface, standardized by the Moving Picture Experts Group) is widely known. Within it, a face recognition descriptor is proposed as a descriptor of meta-data for face recognition (“MPEG-7 Visual part of experimental Model Version 9.0,” A. Yamada et al., ISO/IEC JTC1/SC29/WG11 N3914, 2001).
In this face recognition descriptor, a kind of subspace method typically referred to as the eigen-face method is applied to a clipped and normalized face image to determine a basis matrix for extracting feature values of the face image. Using this basis matrix, a face feature is extracted from the image and treated as the meta-data. As the similarity measure for this face feature, a weighted absolute value distance is proposed.
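As an illustration only, the eigen-face style extraction and the weighted absolute value distance described above might be sketched as follows. This is a minimal sketch, not the MPEG-7 reference implementation: the function names, the random stand-in data, the number of basis vectors, and the uniform weights are all assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np

def fit_eigenface_basis(faces, num_components):
    """Estimate a basis matrix from face images (hypothetical helper).

    faces: array of shape (n_samples, n_pixels), each row a clipped and
    normalized face image flattened into a vector.
    """
    mean = faces.mean(axis=0)
    centered = faces - mean
    # The rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_components]

def extract_feature(face, mean, basis):
    """Project one face image onto the basis to obtain its face feature."""
    return basis @ (face - mean)

def weighted_abs_distance(f1, f2, weights):
    """Weighted absolute value (L1) distance between two face features."""
    return float(np.sum(weights * np.abs(f1 - f2)))

# Random data standing in for real face images (assumption).
rng = np.random.default_rng(0)
faces = rng.random((20, 64))
mean, basis = fit_eigenface_basis(faces, num_components=5)

f_a = extract_feature(faces[0], mean, basis)
f_b = extract_feature(faces[1], mean, basis)
weights = np.ones(5)  # uniform weights, chosen only for illustration
d = weighted_abs_distance(f_a, f_b, weights)
```

A smaller distance indicates more similar face features. In practice the weights would be chosen per feature dimension rather than set to one.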
Various face recognition methods are also known, for example, the eigen-face method based on principal component analysis and methods based on discriminant analysis. Principal component analysis is known, for example, from “Probabilistic Visual Learning for Object Representation”, Moghaddam et al. (IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 696-710, 1997). Discriminant analysis is known, for example, from “Discriminant Analysis of Principal Components for Face Recognition”, W. Zhao et al. (Proceedings of the IEEE Third International Conference on Automatic Face and Gesture Recognition, pp. 336-341, 1998).
Also, a method is known in which, when the subspace method is applied to a feature vector obtained from a fingerprint image, a quality index is introduced so that the distance between patterns is measured adaptively. Examples are “Fingerprint Preselection Using Eigenfeatures”, T. Kamei and M. Mizoguchi (Proceedings of the 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 918-923, 1998) and Japanese Laid Open Patent Application JP-Heisei 10-177650.
However, the conventional techniques described above cannot achieve sufficient face recognition accuracy.
In relation to the above description, a pattern recognizing apparatus that uses feature selection through projection of a feature vector onto a partial eigen-space is disclosed in Japanese Laid Open Patent Application JP-A-Heisei 10-55412. In order to verify many kinds of character patterns, this conventional pattern recognizing apparatus uses feature selection to reduce the number of dimensions of the feature vectors, thereby trying to speed up the recognizing process, and recognizes an input pattern by using a feature vector representative of a feature of the input pattern. An input feature vector extracting unit extracts an input feature vector representative of the feature of the input pattern. An orthonormal basis memory stores an orthonormal basis of the partial eigen-space of the original feature space. A recognition dictionary stores dictionary selection feature vectors, each defined on the partial eigen-space and corresponding to one or more recognition target patterns. A feature selecting unit uses the orthonormal basis stored in the orthonormal basis memory to calculate an input selection feature vector, which is the projection onto the partial eigen-space of the input feature vector extracted by the input feature vector extracting unit. A checking unit checks the input selection feature vector calculated by the feature selecting unit against each dictionary selection feature vector stored in the recognition dictionary, thereby recognizing the kind of the input pattern corresponding to the input selection feature vector.
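The feature selection and checking steps of such an apparatus might be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the toy basis, the dictionary labels "A" and "B", and the use of Euclidean distance for checking are hypothetical, since the cited application is not quoted here as specifying them.

```python
import numpy as np

def select_features(x, basis):
    """Project input feature vector x onto the partial eigen-space.

    basis: array of shape (k, n) whose rows form an orthonormal basis.
    """
    return basis @ x

def recognize(x, basis, dictionary):
    """Return the label whose dictionary vector is nearest to the projection.

    Euclidean distance is an assumption made for this sketch.
    """
    z = select_features(x, basis)
    return min(dictionary, key=lambda label: np.linalg.norm(z - dictionary[label]))

# Toy 2-dimensional partial eigen-space of a 4-dimensional feature space.
basis = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0]])
dictionary = {"A": np.array([1.0, 0.0]), "B": np.array([0.0, 1.0])}

label = recognize(np.array([0.9, 0.2, 5.0, -3.0]), basis, dictionary)
```

Because checking is done on the projected vectors, the last two components of the input do not affect the result, which is how the reduced number of dimensions lowers the cost of each comparison.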
Also, an object detecting apparatus is disclosed in Japanese Laid Open Patent Application JP-A-Heisei 11-306325. This conventional object detecting apparatus aims to detect a verification object accurately with relatively simple processing. An image input unit inputs an image, and a memory stores a region model in which a plurality of judgment element obtainment regions are set in correspondence with feature regions of the verification object to be detected. A position designating unit sequentially designates a check local region position at which the region model stored in the memory is applied to the input image from the image input unit, or to an image that was input in advance from the image input unit and has undergone image processing. Each time the region model is applied to a position designated by the position designating unit, a judgment element obtaining unit obtains a judgment element from each judgment element obtainment region of the region model. A Mahalanobis distance judging unit calculates a Mahalanobis distance on the basis of the judgment elements obtained from the respective judgment element obtainment regions, and judges whether or not the image of the check local region is an image of the verification object. The verification object is thus detected on the basis of the result judged by the judging unit.
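The Mahalanobis distance judgment at one check local region position might be sketched as follows. This is a minimal sketch: the mean vector, covariance matrix, threshold, and function names are illustrative assumptions and are not values taken from the cited application.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance of judgment element vector x."""
    d = x - mean
    return float(d @ cov_inv @ d)

def is_verification_object(x, mean, cov_inv, threshold):
    """Judge the region as the verification object if the distance is small."""
    return mahalanobis_sq(x, mean, cov_inv) <= threshold

# Toy statistics standing in for those learned from object samples (assumption).
mean = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])
cov_inv = np.linalg.inv(cov)

near = np.array([2.0, 0.0])  # squared distance 2*2/4 = 1
far = np.array([0.0, 3.0])   # squared distance 3*3/1 = 9
```

With a threshold of 4.0 in this toy setting, `near` would be judged as the verification object and `far` would not, even though `far` is only slightly farther away in Euclidean terms, because the second component has a smaller variance.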
Also, a face verifying and collating method is disclosed in Japanese Laid Open Patent Application JP-A 2000-132675. This conventional face verifying and collating method aims to carry out stable verification even if the two face images to be compared were photographed under different photographing conditions or at different photographing times. In this method, learning is carried out in advance for each class into which the features of the image variation caused by differences in photographing conditions or photographing times are classified. A class is selected on the basis of the difference between two face images that differ in at least one of the photographing condition and the photographing time. From each of the two face images, feature values in which the image variation of the selected class is small are determined, and face verification and collation are carried out on the basis of these feature values of the two face images. As for the features of the image variation, a plurality of sample sets of difference images between two images with different photographing conditions or photographing times are prepared, and principal component analysis is carried out for each class, thereby determining the principal components and the magnitude of the variance of the sample distribution in each principal component direction. To select the class of the image variation features, the distance between the difference image of the two input face images and the space defined by the principal components of each class is calculated, and the class for which the calculated distance is shortest is selected.
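The class selection step, which picks the class whose principal component space lies nearest to the difference image, might be sketched as follows. This is a minimal sketch under stated assumptions: the class names "lighting" and "aging", the toy three-dimensional vectors, the subtraction of a class mean, and the function names are hypothetical and are used only to illustrate the distance-from-subspace calculation.

```python
import numpy as np

def distance_from_subspace(diff, mean, components):
    """Distance between a difference image and a class's principal subspace.

    components: array of shape (k, n) with orthonormal rows (principal axes).
    """
    centered = diff - mean
    # Part of the vector that lies inside the subspace.
    projected = components.T @ (components @ centered)
    # The residual norm is the distance from the subspace.
    return float(np.linalg.norm(centered - projected))

def select_class(diff, class_models):
    """Select the class whose subspace is closest to the difference image."""
    return min(class_models,
               key=lambda name: distance_from_subspace(diff, *class_models[name]))

# Toy class models: (mean, principal components). All values are assumptions.
class_models = {
    "lighting": (np.zeros(3), np.array([[1.0, 0.0, 0.0]])),
    "aging": (np.zeros(3), np.array([[0.0, 0.0, 1.0]])),
}

diff = np.array([2.0, 0.1, 0.0])  # stand-in for a flattened difference image
chosen = select_class(diff, class_models)
```

Here the difference vector lies almost entirely along the first class's principal axis, so its residual for that class is small and that class is selected.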
Also, an image processing apparatus is disclosed in Japanese Laid Open Patent Application JP-A 2000-187733. This conventional image processing apparatus aims to eliminate the need to prepare faces oriented to the left or right, oblique faces, and the like as learning samples. In the image processing apparatus, an image group generating unit generates, from a first reference image group, a second reference image group that is symmetrical to it. A feature information extracting unit extracts feature information by using both the first reference image group and the second reference image group. A judging unit compares the feature information extracted by the feature information extracting unit with an input image, and judges whether or not the input image is an image of the same pattern as the first reference image group. The first reference image group may consist of human face images.
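Generating the second reference image group might be sketched as follows. This is a minimal sketch that assumes the symmetry in question is a left-right mirror of each image, which the summary above does not state explicitly; the function name and the toy image are hypothetical.

```python
import numpy as np

def mirror_group(images):
    """Build a second reference group by mirroring each image left to right.

    Treating "symmetrical" as a horizontal flip is an assumption of this sketch.
    """
    return [np.fliplr(img) for img in images]

# One tiny 2x3 "image" standing in for a reference face image (assumption).
first_group = [np.array([[1, 2, 3],
                         [4, 5, 6]])]
second_group = mirror_group(first_group)

# Both groups would then be passed to feature information extraction.
training_set = first_group + second_group
```

Under this assumption, a sample of a face turned to the left also yields a sample of a face turned to the right, so only one orientation would need to be collected.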