Extremely reliable methods exist for personal identification using biometric data such as fingerprints, retinal patterns or similar unique features of the subject, but these methods rely on the cooperation of the subject. Face recognition may be an effective way of identifying a person without the cooperation or knowledge of that person. A face recognition system faces two main general problems: identifying a person, i.e. determining the identity from images, and verifying the identity of a person, i.e. certifying that the person is who he/she claims to be. Specific applications include immigration, ID cards, passports, computer logon, intranet security, video surveillance and access systems. The present invention aims at increasing the performance and efficiency of such systems using geometric information available through the use of statistical shape models.
In the area of statistical shape models, the invention is related to Active Shape Models (ASM), introduced by Cootes and Taylor ([1]: Cootes T. F. and Taylor C. J., Active Shape Model Search using Local Grey-level Models: A Quantitative Evaluation, British Machine Vision Conference, p. 639-648, 1993). One distinction is that ASMs have been used for inferring 2D shape from 2D observations or 3D shape from 3D observations, whereas the invention uses 2D observations, i.e. images, to infer 3D shape. In addition, the observations are from multiple views (one or more imaging devices), something that is not handled in standard ASM. Cootes and Taylor have a number of patents in the area; the most relevant are WO02103618A1 (Statistical Model), where parameterisation of 2D or 3D shapes is treated, WO0135326A1 (Object Class Identification, Verification or Object Image Synthesis), where an object class is identified in images, and WO02097720A1 (Object Identification), in which objects are identified using modified versions of ASM and related techniques. Also related is Cootes et al. ([2]: Cootes T. F., Wheeler G. V., Walker K. N. and Taylor C. J., View-based Active Appearance Models, Image and Vision Computing, 20(9-10), p. 657-664, 2002), where multi-view models are used but no explicit or consistent 3D data is contained in the model. There are also methods for deforming a 3D model of the object to fit the 2D projections of the object in the images, such as in Blanz and Vetter ([3]: Blanz V. and Vetter T., Face Recognition Based on Fitting a 3D Morphable Model, IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(9), p. 1063-1073, 2003). These methods are very computationally expensive and often require manual intervention. Related patents are U.S. Pat. No. 6,556,196/EP1039417 (Method and apparatus for the processing of images), which describe a method for morphing a 3D model so that it becomes a 3D representation of the object in the image by minimizing the projection error in the image.
One common problem for image-based recognition is detecting the 2D shape of the object in the image, i.e. finding the relevant image region. Recent methods for detecting objects in images usually involve scanning the whole image at different scales for object-specific image patterns and then using a classifier to decide whether each region is relevant or not. The latest developments suggest the use of Support Vector Machines (SVM) for this task. A key element is the extraction of image features, i.e. parts of the image such as corners, edges and other interest points. This is usually done using correlation-based schemes with templates or edge-based methods with image gradients. For an overview of methods for face detection and feature extraction, cf. Zhao et al. ([4]: Zhao W., Chellappa R., Rosenfeld A. and Phillips P. J., Face Recognition: A Literature Survey, Technical report CAR-TR-948, 2000) and the references therein. In [4] a review of current image-based methods for face recognition is also presented.
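By way of illustration only, the scanning scheme described above may be sketched as follows. This is a minimal sketch under stated assumptions and not a detector from any of the cited references: the names `sliding_windows`, `detect` and `bright_patch_classifier` are hypothetical, the scales are produced by plain subsampling rather than a proper image pyramid, and the classifier is a placeholder threshold standing in for a trained classifier such as an SVM.

```python
import numpy as np

def sliding_windows(image, window=8, step=4, scales=(1, 2)):
    """Yield (scale, row, col, patch) for each window position at each scale.

    Scale s keeps every s-th pixel, a crude stand-in for an image pyramid.
    Row and col are reported in the coordinates of the original image.
    """
    for s in scales:
        scaled = image[::s, ::s]
        rows, cols = scaled.shape
        for r in range(0, rows - window + 1, step):
            for c in range(0, cols - window + 1, step):
                yield s, r * s, c * s, scaled[r:r + window, c:c + window]

def bright_patch_classifier(patch, threshold=0.9):
    """Hypothetical placeholder for a trained classifier (e.g. an SVM)."""
    return patch.mean() > threshold

def detect(image, classifier, **kwargs):
    """Return (scale, row, col) for every window the classifier accepts."""
    return [(s, r, c)
            for s, r, c, patch in sliding_windows(image, **kwargs)
            if classifier(patch)]

# Synthetic test image: a single bright 8 x 8 square on a dark background.
image = np.zeros((32, 32))
image[8:16, 8:16] = 1.0
detections = detect(image, bright_patch_classifier)
```

In this toy setting the only accepted window is the one aligned with the bright square at scale 1. The nested loops over scales and positions also indicate why such exhaustive scanning becomes costly for large images.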
When using image-based methods for identification and verification there are two major problems: illumination variation and pose variation. Illumination variation affects all correlation-based methods where parts of images are compared, since the pixel values vary with changing illumination. Specular reflections can also give rise to large changes in pixel intensity. Pose variation occurs since the projection in the image can change dramatically as the object rotates. These two problems have been documented in many face recognition systems and are unavoidable when the images are acquired in uncontrolled environments. Most of the known methods fail to handle these problems robustly.
The illumination problem is handled by the invention since no image correlation or comparison of image parts is performed. Instead, features such as corners, which are robust to intensity changes, are computed, which makes the shape reconstruction, to a large extent, insensitive to illumination and specular reflections. The invention handles the pose problem by using any number of images with different poses for training the statistical model. Any subset of the images, as few as a single image, can then be used to infer the 3D shape of the object.
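The robustness of corner features to intensity changes may be illustrated with a small example. The following is a sketch only and is not the feature detector of the invention: it assumes a basic Harris-style corner response, a 3 x 3 box window in place of the usual Gaussian window, and an illumination change modelled as a global gain and offset. The names `box_sum`, `harris_response` and `strongest_corner` are hypothetical.

```python
import numpy as np

def box_sum(a, k=3):
    """Sum of a over a k x k neighbourhood, with zero padding at the border."""
    pad = k // 2
    p = np.pad(a, pad)
    out = np.zeros_like(a, dtype=float)
    for dr in range(k):
        for dc in range(k):
            out += p[dr:dr + a.shape[0], dc:dc + a.shape[1]]
    return out

def harris_response(image, kappa=0.05):
    """Harris-style corner response computed from image gradients only."""
    gy, gx = np.gradient(image.astype(float))
    sxx = box_sum(gx * gx)
    syy = box_sum(gy * gy)
    sxy = box_sum(gx * gy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - kappa * trace ** 2

def strongest_corner(image):
    """Return (row, col) of the maximum corner response."""
    response = harris_response(image)
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)

# Synthetic image with one corner where the bright quadrant begins.
image = np.zeros((20, 20))
image[10:, 10:] = 1.0

# Assumed illumination change: global gain of 2 and offset of 50.
brighter = 2.0 * image + 50.0

corner_before = strongest_corner(image)
corner_after = strongest_corner(brighter)
pixel_change = np.abs(brighter - image).mean()
```

Under this model the raw pixel values change substantially (`pixel_change` is large), so a direct comparison of image parts would be affected, whereas the detected corner location is the same in both images. The offset vanishes in the gradients and the gain only rescales the response. Spatially varying illumination and specular reflections are not covered by this simple model.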