Conventionally, image collation devices that compare and collate an input two-dimensional image with a previously recorded two-dimensional image have been put to practical use. In particular, various image collation devices have been proposed for realizing face authentication, which is one of the authentication methods using biometrics. In such image collation devices, face images of a plurality of persons who can be authenticated (hereinafter referred to as “registered persons”) are registered in advance in a database as registered images. A face image of a person who is to be authenticated (hereinafter referred to as “a person to be authenticated”) is compared and collated with the registered images. When it is determined as a result that the face image of the person to be authenticated matches or resembles the registered image of a certain registered person, the person to be authenticated is authenticated as that registered person.
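The comparison and collation flow described above can be illustrated with a minimal sketch. It is not taken from any particular device: the function names `normalized_correlation` and `authenticate`, the use of cosine similarity as the resemblance measure, and the threshold value of 0.95 are all assumptions chosen for illustration, and images are represented as flat lists of pixel intensities.

```python
import math

def normalized_correlation(a, b):
    """Cosine similarity between two flattened grayscale images (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def authenticate(probe, database, threshold=0.95):
    """Return the ID of the best-matching registered person, or None.

    `database` maps a registered person's ID to one registered image.
    The threshold is a hypothetical value, not one given in the text.
    """
    best_id, best_score = None, threshold
    for person_id, registered_image in database.items():
        score = normalized_correlation(probe, registered_image)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id

# Two registered persons, each with one tiny 4-pixel "image".
database = {"alice": [10, 200, 30, 40], "bob": [200, 10, 40, 30]}
result = authenticate([11, 198, 29, 41], database)
```

In this sketch the person to be authenticated is accepted only when at least one registered image reaches the threshold; otherwise `None` is returned and authentication fails.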
Such an image collation device has the conventional problem that the authentication rate may be reduced by differences between the photographing conditions under which the face image of the person to be authenticated is taken and the photographing conditions under which the registered image was taken. For example, an illumination condition such as the direction from which light illuminates the person to be authenticated when the face image is taken (hereinafter, this direction is referred to as an “illumination direction”) may differ from the corresponding illumination condition when the registered image was taken. In that case, even when the same subject is photographed, the comparison and collation may determine that the two images do not match.
In order to solve this problem, various techniques have recently been proposed. For example, for each of the registered persons, the illumination condition at the time of photographing, the face shape (normal vector), the reflectance and the like are estimated from one registered image. By using these estimates, a plurality of images under different illumination conditions (referred to as a “registered image group”) are generated and registered in a database. At the time of authentication, the face image of the person to be authenticated is compared and collated with all of the registered image groups registered in the database, thereby improving the authentication rate (see, for example, T. Sim, T. Kanade, “Combining Models and Exemplars for Face Recognition: An Illuminating Example,” Proc. CVPR Workshop on Models versus Exemplars in Computer Vision, 2001).
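How a registered image group might be generated from estimated normal vectors and reflectances can be sketched as follows. The sketch assumes a purely diffuse (Lambertian) reflection model, in which the intensity of each pixel is the reflectance multiplied by the non-negative part of the inner product of the normal vector and the illumination direction. The function names `render_lambertian` and `make_registered_image_group` are hypothetical, and the cited technique additionally models an error component that is omitted here.

```python
def render_lambertian(normals, albedos, light):
    """Render one image: per-pixel intensity = albedo * max(0, n . s)."""
    return [
        rho * max(0.0, sum(n_k * s_k for n_k, s_k in zip(n, light)))
        for n, rho in zip(normals, albedos)
    ]

def make_registered_image_group(normals, albedos, lights):
    """Render one image per illumination direction (assumed unit vectors)."""
    return [render_lambertian(normals, albedos, s) for s in lights]

# Two pixels: one facing the camera (+z), one facing sideways (+x).
normals = [(0, 0, 1), (1, 0, 0)]
albedos = [0.5, 0.8]
lights = [(0, 0, 1), (1, 0, 0), (-1, 0, 0)]
group = make_registered_image_group(normals, albedos, lights)
```

Each entry of `group` is one synthesized image. Under the third illumination direction both pixels face away from or perpendicular to the light, so the rendered image is entirely dark.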
According to the above-described conventional technique, a statistical model is first built using a learning face image set (hereinafter referred to as a “bootstrap image”) taken under various illumination conditions of a person different from the registered persons. Then, using this statistical model, a plurality of registered image groups corresponding to different illumination conditions are synthesized from the registered images of the registered persons.
However, according to the above-described conventional technique, when the face shape (normal vector), an error component other than the diffuse reflection component at the time of photographing, and the like are estimated from one registered image for each of the registered persons, the calculation is performed separately for each of the pixels constituting the registered image. Consequently, when a pixel to be calculated is positioned in shadow, the normal vector and the reflectance of that pixel (the so-called normal albedo vector) cannot always be estimated correctly, which gives rise to the problem that a registered image group matching actual images cannot be created.
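The difficulty with shadowed pixels can be illustrated with a small sketch. It again assumes a Lambertian model and, for simplicity, a known normal vector and illumination direction, which is a simplification of the estimation described above. The function names `shading`, `observe` and `estimate_albedo` are hypothetical. A pixel in shadow is observed as zero regardless of its reflectance, so a calculation that looks at that pixel alone has no information from which to recover it.

```python
def shading(normal, light):
    """Lambertian shading term max(0, n . s); zero means the pixel is in shadow."""
    return max(0.0, sum(n * s for n, s in zip(normal, light)))

def observe(normal, albedo, light):
    """Observed intensity of one pixel under the assumed diffuse model."""
    return albedo * shading(normal, light)

def estimate_albedo(intensity, normal, light):
    """Invert the model for a single pixel; returns None when it is in shadow."""
    shade = shading(normal, light)
    if shade == 0.0:
        return None  # the observation carries no information about the albedo
    return intensity / shade

light = (1, 0, 0)
lit_normal = (0.6, 0, 0.8)        # faces toward the light
shadow_normal = (-0.6, 0, 0.8)    # faces away from the light

lit_estimate = estimate_albedo(observe(lit_normal, 0.5, light), lit_normal, light)
dark_a = observe(shadow_normal, 0.2, light)
dark_b = observe(shadow_normal, 0.9, light)
shadow_estimate = estimate_albedo(dark_a, shadow_normal, light)
```

Here `dark_a` and `dark_b` are identical even though the underlying reflectances differ, which is why a per-pixel calculation cannot distinguish them.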