The present invention relates to the field of face recognition systems. More specifically, the present invention robustly authenticates facial images by using recognition-by-parts, boosting, and transduction.
Understanding how people process and recognize each other's faces and developing robust face recognition systems still remain a grand challenge for computational intelligence, in general, and computer vision, in particular. The face recognition challenge belongs to biometrics, the science of authenticating people by measuring their physical or external appearance. In addition to security and surveillance, the ability to recognize living creatures has also become a critical enabling technology for a wide range of applications that includes defense, health care, human-computer interaction, image retrieval and data mining, industrial and personal robotics, and transportation.
Face recognition is largely motivated by the need for surveillance and security, telecommunication and digital libraries, human-computer intelligent interaction, and smart environments. Some of these security uses may include login control and physical access control. Additional applications may include law enforcement uses, such as mug shot albums and criminology, as well as commercial transactions that involve the use of credit cards, driver's licenses, passports, or other photo identifications. Virtually all applications that depend upon the identification of a person could benefit from this technology.
The solutions suggested so far are synergistic efforts that draw on fields such as signal and image processing, pattern recognition, machine learning, neural networks, statistics, evolutionary computation, psychophysics of human perception and neurosciences, and system engineering. A generic approach often used involves statistical estimation and the learning of face class statistics for subsequent face detection and classification. Face detection generally applies a statistical characterization of faces and non-faces to build a classifier, which may then be used to search over different locations and scales for image patterns that are likely to be human faces.
Face recognition usually employs various statistical techniques to derive appearance-based models for classification. Some of these techniques include, but are not limited to, Principal Component Analysis (hereinafter referred to as PCA); Fisher Linear Discriminant (hereinafter referred to as FLD), which is also known as Linear Discriminant Analysis (hereinafter referred to as LDA); Independent Component Analysis (hereinafter referred to as ICA); Local Feature Analysis (hereinafter referred to as LFA); and Gabor and bunch graphs. Descriptions of PCA may be found in: [M. Turk and A. Pentland, “Eigenfaces for Recognition,” 3 J. Cognitive Neurosci. 71-86 (1991)], and [B. Moghaddam and A. Pentland, “Probabilistic Visual Learning for Object Representation,” 19 IEEE Trans. Pattern Analysis and Machine Intel. 696-710 (1997)]. Descriptions of FLD and LDA may be found in: [D. L. Swets and J. Weng, “Using Discriminant Eigenfeatures for Image Retrieval,” 18 IEEE Trans. Pattern Analysis and Machine Intel. 831-36 (1996)]; [P. N. Belhumeur et al., “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” 19 IEEE Trans. Pattern Analysis and Machine Intel. 711-20 (1997)], and [K. Etemad and R. Chellappa, “Discriminant Analysis for Recognition of Human Face Images,” 14 J. Opt. Soc. Am. A 1724-33 (1997)]. A description of ICA may be found in: [G. Donato et al., “Classifying Facial Actions,” 21 IEEE Trans. Pattern Analysis and Machine Intel. 974-89 (1999)]. LFA is described in: [P. S. Penev and J. J. Atick, “Local Feature Analysis: A General Statistical Theory for Object Representation,” 7 Network: Computation in Neural Sys. 477-500 (1996)].
Face recognition may depend heavily on the particular choice of features used by the classifier. One usually starts with a given set of features and then attempts to derive an optimal subset (under some criteria) of features leading to high classification performance, with the expectation that similar performance may also be displayed on future trials using novel (unseen) test data. PCA is a popular technique used to derive a starting set of features for both face representation and recognition. Kirby and Sirovich showed that any particular face may be (i) economically represented along the eigenpictures coordinate space, and (ii) approximately reconstructed using just a small collection of eigenpictures and their corresponding projections (‘coefficients’) [M. Kirby and L. Sirovich, “Application of the Karhunen-Loeve Procedure for the Characterization of Human Faces,” 12 IEEE Trans. Pattern Analysis and Machine Intel. 103-08 (1990)].
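The representation and reconstruction properties noted by Kirby and Sirovich can be illustrated with a minimal sketch. The function names (`fit_eigenpictures`, `project`, `reconstruct`), the synthetic data standing in for face images, and the choice of five eigenpictures are all assumptions made for illustration; they are not taken from the cited work.

```python
import numpy as np

def fit_eigenpictures(faces, k):
    """Return the mean image and the top-k eigenpictures (rows) of `faces`.

    faces: array of shape (n_samples, n_pixels), one flattened image per row.
    """
    mean = faces.mean(axis=0)
    # The right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:k]

def project(face, mean, basis):
    """Coefficients of one face along the eigenpicture coordinate axes."""
    return basis @ (face - mean)

def reconstruct(coeffs, mean, basis):
    """Approximate a face from its coefficients and the eigenpictures."""
    return mean + basis.T @ coeffs

# Synthetic stand-in for face images (an assumption): 40 samples of 64 "pixels"
# that lie close to a 5-dimensional subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 5))
mixing = rng.normal(size=(5, 64))
faces = latent @ mixing + 0.01 * rng.normal(size=(40, 64))

mean, basis = fit_eigenpictures(faces, k=5)
coeffs = project(faces[0], mean, basis)    # 5 numbers instead of 64 pixels
approx = reconstruct(coeffs, mean, basis)
rel_error = np.linalg.norm(faces[0] - approx) / np.linalg.norm(faces[0])
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this sketch each image is summarized by five coefficients, and the reconstruction error stays small only because the synthetic data were built to lie near a low-dimensional subspace.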
Applying the PCA technique to face recognition, Turk and Pentland developed the well-known eigenfaces method, which sparked an explosion of interest in applying statistical techniques to face recognition. However, PCA, an optimal representation criterion (in the sense of mean square error), does not consider the classification aspect. One solution for taking into account and improving the classification performance is to combine PCA, the optimal representation criterion, with the Bayes classifier, the optimal classification criterion (when the density functions are given). Toward that end, Moghaddam and Pentland developed a probabilistic visual learning method, which uses the eigenspace decomposition as an integral part of estimating complete density functions in high-dimensional image space. While the leading eigenvalues are derived directly by PCA, the remainder of the eigenvalue spectrum is estimated by curve fitting.
Rather than estimating the densities in high-dimensional space, Liu and Wechsler developed a PRM (Probabilistic Reasoning Model) method by first applying PCA for dimensionality reduction and then applying the Bayes classifier and the MAP rule for classification. [C. Liu and H. Wechsler, “Robust Coding Schemes for Indexing and Retrieval from Large Face Databases,” 9 IEEE Trans. Image Processing 132-37 (2000)]. The rationale of the PRM method is to lower the space dimension, subject to increased fitness for the discrimination index, by estimating the conditional density function of each class using the within-class scatter in the reduced PCA space.
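The two-stage idea described above (PCA for dimensionality reduction, then a Bayes classifier with the MAP rule) can be sketched as follows. This is not the PRM method as published. It is a simplified illustration that assumes Gaussian class-conditional densities with a shared diagonal covariance estimated from the pooled within-class scatter, together with equal class priors. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pca(X, m):
    """Mean and top-m principal directions (rows) of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:m]

def fit_map_classifier(Z, y):
    """Class means and pooled within-class variance per reduced dimension."""
    classes = np.unique(y)
    means = np.stack([Z[y == c].mean(axis=0) for c in classes])
    residuals = Z - means[np.searchsorted(classes, y)]
    var = (residuals ** 2).sum(axis=0) / (len(Z) - len(classes))
    return classes, means, var

def predict_map(Z, classes, means, var):
    """MAP decision under the stated assumptions.

    With equal priors and one shared diagonal covariance, maximizing the
    posterior is the same as minimizing the variance-weighted squared distance.
    """
    dist = (((Z[:, None, :] - means[None, :, :]) ** 2) / var).sum(axis=2)
    return classes[dist.argmin(axis=1)]

# Synthetic data (an assumption): 3 classes, only 4 training samples each.
rng = np.random.default_rng(1)
n_classes, n_pixels = 3, 50
centers = rng.normal(scale=3.0, size=(n_classes, n_pixels))

def sample(per_class):
    X = np.concatenate(
        [c + 0.5 * rng.normal(size=(per_class, n_pixels)) for c in centers]
    )
    y = np.repeat(np.arange(n_classes), per_class)
    return X, y

X_train, y_train = sample(4)
X_test, y_test = sample(10)

mean, basis = fit_pca(X_train, m=2)
classes, means, var = fit_map_classifier((X_train - mean) @ basis.T, y_train)
pred = predict_map((X_test - mean) @ basis.T, classes, means, var)
accuracy = (pred == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

The density parameters are estimated in the two-dimensional PCA space rather than in the original 50-dimensional space, which is what keeps the estimate workable with only a few samples per class.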
Another important statistical technique widely used in face recognition is the FLD (or LDA), which models both the within- and the between-class scatters. FLD, which is behind several face recognition methods, induces non-orthogonal projection bases, a characteristic known to have great functional significance in biological sensory systems [J. G. Daugman, “An Information-Theoretic View of Analog Representation in Striate Cortex,” Computational Neuroscience 403-24 (MIT Press 1990)]. As the original image space is highly dimensional, most face recognition methods first perform dimensionality reduction using PCA, as is the case with the Fisherfaces method suggested by Belhumeur et al. Swets and Weng have pointed out that the eigenfaces method derives only the Most Expressive Features (MEF) and that PCA-inspired features do not necessarily provide for good discrimination. As a consequence, the subsequent FLD projections are used to build the Most Discriminating Features (MDF) classification space. The MDF space is, however, superior to the MEF space for face recognition only when the training images are representative of the range of face (class) variations. Otherwise, the performance difference between the MEF and MDF is not significant.
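The PCA-then-FLD pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Fisherfaces implementation of Belhumeur et al. The function names, the synthetic data, the choice of six PCA dimensions, and the nearest-class-mean rule used in the MDF space are all hypothetical choices made for the example.

```python
import numpy as np

def fit_pca(X, m):
    """Mean and top-m principal directions (rows) of X: the MEF space."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:m]

def fit_fld(Z, y, n_components):
    """FLD projection (columns) maximizing between- over within-class scatter."""
    classes = np.unique(y)
    overall = Z.mean(axis=0)
    d = Z.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mu = Zc.mean(axis=0)
        Sw += (Zc - mu).T @ (Zc - mu)
        Sb += len(Zc) * np.outer(mu - overall, mu - overall)
    # Whiten the within-class scatter, then diagonalize the between-class
    # scatter in the whitened space. The resulting columns are generally
    # not orthogonal to each other.
    w_vals, w_vecs = np.linalg.eigh(Sw)
    whiten = w_vecs / np.sqrt(w_vals)
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ Sb @ whiten)
    top = np.argsort(b_vals)[::-1][:n_components]
    return whiten @ b_vecs[:, top]

# Synthetic data (an assumption): 3 classes, 5 training samples each.
rng = np.random.default_rng(2)
n_classes, n_pixels = 3, 50
centers = rng.normal(scale=3.0, size=(n_classes, n_pixels))

def sample(per_class):
    X = np.concatenate(
        [c + 0.5 * rng.normal(size=(per_class, n_pixels)) for c in centers]
    )
    y = np.repeat(np.arange(n_classes), per_class)
    return X, y

X_train, y_train = sample(5)
X_test, y_test = sample(10)

# Step 1: PCA. Keeping at most (samples - classes) dimensions leaves the
# within-class scatter nonsingular in the reduced space.
mean, basis = fit_pca(X_train, m=6)
Z_train = (X_train - mean) @ basis.T
Z_test = (X_test - mean) @ basis.T

# Step 2: FLD in the PCA space gives the MDF space (at most classes - 1 dims).
proj = fit_fld(Z_train, y_train, n_components=n_classes - 1)
F_train, F_test = Z_train @ proj, Z_test @ proj

# Nearest class mean in the MDF space.
class_means = np.stack([F_train[y_train == c].mean(axis=0) for c in range(n_classes)])
dist = ((F_test[:, None, :] - class_means[None, :, :]) ** 2).sum(axis=2)
pred = dist.argmin(axis=1)
accuracy = (pred == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

The synthetic classes here are well separated and the training samples are representative of the test samples, which is the favorable case for the MDF space noted above.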
The drawback of FLD is that it requires large sample sizes for good generalization. For a face recognition problem, however, there is usually a large number of faces (classes), but only a few training examples per face. One possible remedy for this drawback, according to Etemad and Chellappa, is to artificially generate additional data and thus increase the sample size. Yet another remedy, according to Liu and Wechsler, is to improve FLD's generalization performance by balancing the need for adequate signal representation and subsequent classification performance using sensitivity analysis on the spectral range of the within-class eigenvalues.
Other developments, which are conceptually relevant to the face recognition community in general, include LFA, and the related Dynamic Link Architecture (hereinafter referred to as DLA) [M. Lades et al., “Distortion Invariant Object Recognition in the Dynamic Link Architecture,” 42 IEEE Trans. Computers 300-11 (1993)], and elastic graph matching methods [L. Wiskott et al., “Face Recognition by Elastic Bunch Graph Matching,” 19 IEEE Trans. Pattern Analysis and Machine Intel 775-79 (1997)]. LFA uses a sparse version of the PCA transform, followed by a discriminative network. DLA starts by computing Gabor jets, and then it performs a flexible template comparison between the resulting image decompositions using graph-matching.
While each of these techniques aids in face recognition, all of them are slow. Thus, what is needed is a face authentication system that can address these problems efficiently and economically.