1. Field of the Invention
The present invention relates to an apparatus and a method for performing face image processing operations, such as face recognition and detection of a face direction, on input images captured by a plurality of cameras (multiple cameras).
2. Background Art
Recognition using face images is a very useful technique from a security standpoint because, unlike physical keys and passwords, a face cannot be lost or forgotten. However, the size of a face in an image varies with individual differences among users, and the direction of a face is not constant, so these variations in face patterns must be absorbed in order to achieve high-precision recognition. The individual differences arise from, for example, the standing positions and postures (shapes of the backs) of the users.
As a conventional technique for performing individual recognition using face images, the following publication, for instance, has been proposed:
“Face recognition system ‘smartface’ robust to changes in face direction and expression” by Yamaguchi and Fukui, Japanese Telecommunication Institute publication (D-II), volume J4-D-II, No. 6, pages 1045 to 1052, 2001.
In this conventional method, variations in face patterns are suppressed by using moving pictures, and individual recognition is then carried out. To achieve high-precision recognition, it is important to collect a wide variety of face patterns of each person from the moving pictures. However, there is a problem in that the acquisition of these face patterns depends upon how the users move their faces.
In security systems that use face image recognition, there is a high risk of unauthorized access by means of face photographs. There is therefore a need to discriminate correctly whether an input face image shows an actual face or a face photograph. In a security system having only one video camera, it is difficult to distinguish an actual face from a face photograph on the basis of image information alone when the image quality of the photograph is degraded. One conceivable approach to correct discrimination is the “shape-from-motion” technique, which extracts three-dimensional information from the motion of a subject. However, feature points from which three-dimensional information can be correctly extracted are difficult to detect on human faces, which are non-rigid objects having little texture. Consequently, a security system having only one video camera and no special device can hardly judge correctly whether or not the subject is a photograph.
Since the field angle of a single video camera is limited, the image captured by that camera is subject to problems of occlusion and reflection. That is, feature points of a face are hidden depending on the direction of the face or on reflections from spectacles. As a result, it is practically difficult to detect all of the face feature points continuously, owing to the hiding of feature points and to shadows. Moreover, when the feature point at the correct position is hidden, detection results readily become unstable and inaccurate: either the feature point cannot be detected at all, or a point shifted from the correct position is erroneously detected in its place.
When trying to understand human actions from images, the direction in which a person faces is very important information. There is therefore a need to detect face directions robustly. Conventionally, the following methods are known: cutting out a specific face region from an image obtained by a single camera and matching the cut-out region against templates of the face region photographed at various angles; extracting feature points and calculating the face direction from geometrical information; and detecting the three-dimensional shape of the face and calculating the face direction from that shape. However, these conventional methods have the following problems. In the template matching method, since only one image is compared, high-precision detection can hardly be achieved. In the geometrical method, the feature points of the face can hardly be extracted correctly, and since the face is a non-rigid object, the calculated face direction contains errors. The method using the three-dimensional shape of the face requires either a special device such as a range finder or, in the case of a stereo-image method, a high calculation cost, so that reconstruction of the face shape itself can hardly be achieved.
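The template matching approach mentioned above can be illustrated by a minimal sketch. It is not taken from any cited publication; the function names `ncc` and `estimate_direction`, the use of normalized cross-correlation as the matching score, and the one-dimensional toy "images" are all assumptions made for illustration only.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation between two equal-length pixel lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0

def estimate_direction(face_patch, templates):
    """Return (angle, score) of the template that best matches the patch.

    `templates` maps a hypothetical face angle in degrees to a template
    of the same size as `face_patch`.
    """
    scores = {angle: ncc(face_patch, tpl) for angle, tpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy data (assumed): brightness profiles of a face seen at three angles.
templates = {
    -30: [9, 7, 5, 3, 1, 1],
    0: [1, 5, 9, 9, 5, 1],
    30: [1, 1, 3, 5, 7, 9],
}
patch = [2, 6, 10, 9, 4, 2]  # a single cut-out face region

angle, score = estimate_direction(patch, templates)
print(angle, round(score, 3))
```

Because the estimate rests on one cut-out image compared against a coarse set of angles, any error in that single image propagates directly to the result, which is the precision limitation noted above.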
Conventional face recognition systems may also employ a plurality of cameras, as in JP-A-2002-183734, for example. However, merely increasing the number of cameras gives rise to another problem, namely that plural users are mixed with each other. For instance, assuming that only one face can be detected at a time, when a plurality of users are photographed by the multiple cameras, one camera may detect only one user while another camera detects only a different user. If the face features detected at that time are treated as belonging to the same person, information on different people is mixed, causing erroneous face recognition.
Moreover, if other persons are mixed in while a user is being registered, not only is information about different users registered together, but the following security problem may also occur. Assume that there are users “A” and “B”, and that user “B” is a person who is not allowed to be registered. User “B” may then yield a high similarity measure with respect to the registration data of user “A”, and may therefore pass through a gate while posing as user “A”. Furthermore, even when a plurality of faces can be detected at a time, the same problem of mixing in other persons may occur if the correspondence of the detected faces among the cameras cannot be established.
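The effect of mixing users “A” and “B” during registration can be illustrated numerically by the following sketch. The enrollment by averaging feature vectors, the cosine similarity measure, the function names `enroll` and `cosine`, and all feature values are hypothetical assumptions chosen for illustration; they do not represent the recognition method of any cited system.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def enroll(feature_vectors):
    """Hypothetical registration: average the collected feature vectors."""
    n = len(feature_vectors)
    return [sum(col) / n for col in zip(*feature_vectors)]

# Toy feature vectors (assumed values).
a_cam1 = [1.0, 0.1, 0.0]   # user A as detected by camera 1
a_cam2 = [0.9, 0.2, 0.0]   # user A as detected by camera 2
b_cam2 = [0.0, 0.2, 1.0]   # user B detected by camera 2 instead of A
b_probe = [0.1, 0.1, 0.9]  # user B later attempting to pass as A

clean = enroll([a_cam1, a_cam2])  # registration data containing only A
mixed = enroll([a_cam1, b_cam2])  # registration data with B mixed in

sim_clean = cosine(b_probe, clean)
sim_mixed = cosine(b_probe, mixed)
print(round(sim_clean, 3), round(sim_mixed, 3))
```

In this toy case the similarity of user “B” to the registration data of user “A” rises sharply once a feature vector of “B” has been mixed into the registration, which corresponds to the impersonation risk described above.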
As explained above, conventional techniques that use face images have various problems. That is, a wide variety of face pattern images must be collected for each individual, and the recognition precision is limited by variations in the standing positions or face directions of persons. In addition, when personal identification is carried out using a plurality of cameras, plural users may be mixed with each other, so that the personal identification is carried out erroneously.