1. Field of the Invention
This invention relates to an image recognition apparatus, an image extraction apparatus, an image extraction method, and a program for recognizing an object such as a face.
2. Description of the Related Art
Systems that authenticate an individual by recognizing the face of the individual have attracted attention in recent years as demand for improved security through individual authentication has increased.
Such an individual authentication system based on a face adopts a method for enhancing the accuracy of face recognition by detecting variation such as rotation of a face image or a shift from the center thereof, for example, as disclosed in JP-A-2003-281541.
Specifically, for example, in the method disclosed in JP-A-2003-281541, to recognize variation of the face image, a feature pattern (positions of the eyes, position of the mouth, etc.) is first extracted for each area of the sample face image of the individual to be recognized, based on a feature amount such as entropy. The feature pattern is projected onto a subspace and stored as the resulting coordinates. When the face is recognized, a feature pattern is extracted in a similar manner from the input face image of the individual and projected onto the subspace. If the degree of similarity between this projection and the projection of the stored feature pattern exceeds a threshold value, the input face image and the stored face image are recognized as face images of the same person.
With this comparison alone, however, high accuracy may not be obtained because of variation such as inclination of the face. Therefore, a neural net is trained on variation examples of face images, using a large number of intentionally varied face images as teacher signals, and the feature patterns are input to the trained neural net, whereby normalized images with the effects of the variations removed are obtained.
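The subspace comparison described above can be illustrated with a minimal sketch. It is not taken from JP-A-2003-281541: the orthonormal basis, the use of cosine similarity as the "degree of similarity", the default threshold of 0.9, and all function names are assumptions made only for illustration.

```python
import math

def project(pattern, basis):
    """Return the coordinates of `pattern` in the subspace spanned by `basis`.

    `basis` is assumed to be a list of orthonormal vectors (hypothetical input).
    """
    return [sum(p * b for p, b in zip(pattern, vec)) for vec in basis]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def is_same_person(input_pattern, stored_coords, basis, threshold=0.9):
    """Project the input feature pattern and compare it with the stored projection.

    The similarity measure and the threshold value are illustrative assumptions.
    """
    coords = project(input_pattern, basis)
    return cosine_similarity(coords, stored_coords) > threshold

# Usage with a toy two-dimensional subspace of a three-dimensional feature space.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
stored = project([2.0, 1.0, 5.0], basis)        # coordinates stored at enrollment
print(is_same_person([2.1, 0.9, -3.0], stored, basis))
print(is_same_person([-1.0, 2.0, 0.0], stored, basis))
```

In this sketch the third feature component lies outside the subspace and therefore does not influence the decision, which shows why an error in the extracted feature pattern that does fall within the subspace directly degrades the comparison.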
The face image is obtained by cutting out a perfect circle 102 from the objective image data, with the midpoint between the two eyes, detected by calculating entropy for each partial area of the objective image data, as a center 100, for example, as shown in FIG. 9. A perfect circle is adopted because cutting out the face of a human being as a perfect circle is a sufficient condition for extracting the portion of the face image necessary for feature pattern extraction. Conventionally, it was considered that even if background is included in the cut-out image, the accuracy of learning and recognition does not change, owing to the formation of the subspace and the learning by the neural net.
In fact, however, for example, as shown in FIG. 10, when the face does not face the front, if the image is cut out as a perfect circle 106 with the midpoint between the two eyes as a center 104 as described above, a portion 108 other than the face portion (usually, a background portion) is contained in the cut-out image, and the effect of the background may make it difficult to extract a feature pattern or may cause a large error in the extracted feature pattern. Since the feature pattern input to the neural net then has a large error, learning cannot be performed with good accuracy in the learning phase, and significant output cannot be provided if normalization with the neural net is conducted in the recognition phase. Further, projection of the feature pattern onto the subspace is also adversely affected.
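The circular cut-out and the background that it admits can be illustrated with a minimal sketch. The image representation (a list of pixel rows), the face mask, the radius, and all function names are hypothetical and are not taken from the related art; eye detection by entropy is assumed to have been performed already.

```python
def eye_midpoint(left_eye, right_eye):
    """Midpoint between two (x, y) eye positions, used as the circle center."""
    return ((left_eye[0] + right_eye[0]) / 2.0,
            (left_eye[1] + right_eye[1]) / 2.0)

def circular_cutout(image, center, radius, fill=0):
    """Keep pixels within `radius` of `center`; replace all others with `fill`."""
    cx, cy = center
    r2 = radius * radius
    return [[px if (x - cx) ** 2 + (y - cy) ** 2 <= r2 else fill
             for x, px in enumerate(row)]
            for y, row in enumerate(image)]

def background_fraction(face_mask, center, radius):
    """Fraction of pixels inside the circle that do not belong to the face.

    `face_mask[y][x]` is True for face pixels (a hypothetical ground-truth mask).
    """
    cx, cy = center
    r2 = radius * radius
    inside = background = 0
    for y, row in enumerate(face_mask):
        for x, is_face in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= r2:
                inside += 1
                if not is_face:
                    background += 1
    return background / inside if inside else 0.0

# Usage: a 5x5 image in which the face occupies only the columns x <= 2,
# as might happen when the face is turned away from the front.
image = [[1] * 5 for _ in range(5)]
center = eye_midpoint((1, 2), (3, 2))
cut = circular_cutout(image, center, radius=1)
mask = [[x <= 2 for x in range(5)] for _ in range(5)]
print(background_fraction(mask, center, radius=1))
```

When the face fills the circle the returned fraction is zero; when the face is turned, part of the circle falls on background, which corresponds to the portion 108 described above.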