Field of the Invention
The present invention relates to an object discriminating apparatus and an object discriminating method which are particularly suitable for discriminating an object in which a variation might occur.
Description of the Related Art
Various techniques have been proposed for discriminating to which of previously registered categories an object represented by an input image belongs, by comparing the input image with a previously registered image. As a concrete example of the object discriminating technique, there is person authentication, which discriminates a person by using a feature inherent to the individual, such as a face, a fingerprint or the like. Here, the person authentication using the face is called face authentication, and the person authentication using the fingerprint is called fingerprint authentication.
A category in the person authentication corresponds to a name, an ID and the like by which an individual can be identified. To discriminate an object such as a person or the like, it is necessary to previously register, as a registration image, an image of the object intended to be discriminated, together with its name and ID. Namely, by previously registering the registration image, it is possible to actually bring the discrimination into effect. If an image of the object to be discriminated (hereinafter, called an input image) is input, the input image is collated with each of the previously registered registration images. Then, if the matched registration image is found, the category of the object corresponding to the matched registration image is output as a discrimination result. On the other hand, if the matched registration image is not found, a result indicating that there is no appropriate object is output. Hereinafter, discrimination of the category of the object means decision of an individual difference of the object (e.g., a difference of person).
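The registration and collation flow described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the feature vectors, the cosine similarity measure, the threshold value of 0.9, and the names `cosine_similarity`, `discriminate` and `registry` are all hypothetical and are not taken from any particular apparatus.

```python
import math


def cosine_similarity(a, b):
    """Similarity between two feature vectors (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def discriminate(input_features, registry, threshold=0.9):
    """Collate the input with every registration image.

    Returns the category of the best-matching registration image, or
    None when no registration image reaches the (assumed) threshold,
    i.e. when there is no appropriate object.
    """
    best_category, best_score = None, threshold
    for category, registered_features in registry.items():
        score = cosine_similarity(input_features, registered_features)
        if score >= best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical registration data: category (name or ID) -> feature vector.
registry = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
}

matched = discriminate([0.9, 0.1, 0.0], registry)    # close to person_a
unmatched = discriminate([1.0, 1.0, 1.0], registry)  # close to nobody
```

In this sketch, the first input is collated successfully with a registered category, whereas the second input yields None because no registration image is sufficiently similar.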
As the simplest method of discriminating a person from a face image, there is a method of obtaining the pixels of the face image itself as feature quantities, and directly comparing the obtained feature quantities with each other. However, in this method, when a variation state such as face direction, expression, lighting or the like differs between two faces, there is a case where the difference of the pixel values due to the difference of the variation states becomes larger than the difference of the feature quantities due to the difference of persons. In other words, a phenomenon might occur in which the similarity between different persons in the same variation state is higher than the similarity between images of the same person in different variation states. Under the circumstances, various techniques have been proposed, such as a technique of performing comparison based on feature quantities from which the differences of variation states have been eliminated (e.g., “Face Recognition with Local Binary Patterns”, T. Ahonen, A. Hadid and M. Pietikainen, 2004). However, the above problem still cannot be solved sufficiently.
Besides, as another approach for solving the above problem, a technique of normalizing similarity according to a variation factor has been proposed (e.g., Japanese Patent Application Laid-Open No. 2007-140823; and “Multi-Subregion Based Probabilistic Approach Toward Pose Invariant Face Recognition” T. Kanade and A. Yamada, 2003). In this method, the magnitude of similarity due to the difference of variation factor is normalized using a conversion model of similarity previously obtained for each variation factor, thereby aiming to avoid the phenomenon that the similarity between the different persons in the same variation state is higher than the similarity between the same person in the different variation states.
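The idea of normalizing similarity with a conversion model prepared for each variation factor can be illustrated with the following sketch. It assumes, purely for illustration, that a conversion model is a mean and a standard deviation of raw similarity for a given pair of variation states, and that normalization is a z-score. The cited documents do not necessarily use this form, and the state names and numerical values below are hypothetical.

```python
# Hypothetical conversion models: for each pair of variation states,
# the mean and standard deviation of raw similarity (assumed values).
CONVERSION_MODELS = {
    ("frontal", "frontal"): (0.7, 0.1),
    ("frontal", "profile"): (0.3, 0.1),
}


def normalize_similarity(raw_similarity, state_a, state_b):
    """Convert a raw similarity using the model for the state pair."""
    mean, std = CONVERSION_MODELS[(state_a, state_b)]
    return (raw_similarity - mean) / std


# Different persons, same variation state: high raw similarity.
raw_different = 0.8
# Same person, different variation states: lower raw similarity.
raw_same = 0.5

norm_different = normalize_similarity(raw_different, "frontal", "frontal")
norm_same = normalize_similarity(raw_same, "frontal", "profile")
```

With these assumed values, the raw similarity of the different-person pair exceeds that of the same-person pair, but the order is reversed after normalization, which is the effect the normalization aims at.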
To accurately discriminate the category of the object irrespective of the variation factor, it is conceivable to obtain the combination of the variation states of the two images to be collated, and to select the conversion model of similarity corresponding to the obtained combination of the variation states. However, for example, if the direction in which the object is photographed is used as the variation factor, as many conversion models as there are combinations of the object directions are necessary. Moreover, the number of the combinations of the variation states grows at an accelerating pace as the range of directions which the object can take becomes wider, and thus the number of the conversion models increases. Furthermore, for example, if the direction of a light source irradiating the object is considered in addition to the object direction itself, as many conversion models as the number obtained by multiplying the number of combinations of the object directions by the number of combinations of the light source directions are necessary. As a result, the increase in the number of the conversion models excessively consumes the capacity of a recording apparatus for storing the conversion models.
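The growth in the number of conversion models can be made concrete with a short calculation. The sketch below assumes that one conversion model is required per pair of variation states for each variation factor, and that the pair counts of independent factors multiply, since each combination of object directions can occur together with each combination of light source directions. Whether a pair is counted as ordered or unordered is not specified above, so both are offered; the function name `conversion_model_count` and the state counts are hypothetical.

```python
def conversion_model_count(states_per_factor, ordered=True):
    """Number of conversion models needed for the given factors.

    states_per_factor: number of discrete states for each variation
    factor, e.g. [5, 4] for 5 object directions and 4 light source
    directions (assumed values).
    ordered: count (a, b) and (b, a) as distinct pairs when True.
    """
    total = 1
    for n in states_per_factor:
        pairs = n * n if ordered else n * (n + 1) // 2
        total *= pairs
    return total


narrow_range = conversion_model_count([5])        # 5 object directions
wide_range = conversion_model_count([9])          # 9 object directions
with_lighting = conversion_model_count([5, 4])    # plus 4 light directions
```

Under these assumptions, widening the range from 5 to 9 object directions raises the count from 25 to 81, and adding 4 light source directions to the 5 object directions raises it from 25 to 400.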
The present invention aims to enable accurate discrimination of the object irrespective of variations.