The present invention relates to the field of face recognition systems. More specifically, the present invention utilizes a novel Gabor Feature Classifier for face recognition.
Understanding how people process and recognize each other's faces, and the development of corresponding computational models for automated face recognition, are among the most challenging tasks for visual form ('shape') analysis and object recognition. The enormity of the problem has involved hundreds of scientists in interdisciplinary research, but the ultimate solution is yet to come.
Face recognition is largely motivated by the need for surveillance and security, telecommunication and digital libraries, human-computer intelligent interaction, and smart environments. Security uses may include login control and physical access control. Further applications may include: law enforcement uses such as mug shot albums and criminology; and commercial transactions that use credit cards, driver's licenses, passports, or other photo identification. Virtually all applications that depend upon the identification of a person could benefit from this technology.
The solutions suggested so far are synergetic efforts from fields such as signal and image processing, pattern recognition, machine learning, neural networks, statistics, evolutionary computation, psychophysics of human perception and neurosciences, and system engineering. A generic approach often used involves statistical estimation and learning of face class statistics for subsequent face detection and classification. Face detection generally applies a statistical characterization of faces and non-faces to build a classifier, which may then be used to search over different locations and scales for image patterns that are likely to be human faces.
Face recognition usually employs various statistical techniques to derive appearance-based models for classification. These techniques include, but are not limited to: principal component analysis (hereinafter referred to as PCA); Fisher linear discriminant (hereinafter referred to as FLD), which is also known as linear discriminant analysis (hereinafter referred to as LDA); independent component analysis (hereinafter referred to as ICA); local feature analysis (hereinafter referred to as LFA); and Gabor and bunch graphs. Descriptions of PCA may be found in: [M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 13, no. 1, pp. 71-86, 1991], and [B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696-710, 1997]. Descriptions of FLD and LDA may be found in: [D. L. Swets and J. Weng, "Using discriminant eigenfeatures for image retrieval," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 831-836, 1996]; [P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, 1997]; and [K. Etemad and R. Chellappa, "Discriminant analysis for recognition of human face images," J. Opt. Soc. Am. A, vol. 14, pp. 1724-1733, 1997]. A description of ICA may be found in: [G. Donato, M. S. Bartlett, J. C. Hager, P. Ekman, and T. J. Sejnowski, "Classifying facial actions," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 10, pp. 974-989, 1999]. LFA is described in [P. S. Penev and J. J. Atick, "Local feature analysis: A general statistical theory for object representation," Network: Computation in Neural Systems, vol. 7, pp. 477-500, 1996]. A description of Gabor and bunch graphs may be found in [L. Wiskott, J. M. Fellous, N. Kruger, and C. von der Malsburg, "Face recognition by elastic bunch graph matching," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 775-779, 1997].
Face recognition may depend heavily on the particular choice of features used by the classifier. One usually starts with a given set of features and then attempts to derive an optimal subset (under some criteria) of features leading to high classification performance, with the expectation that similar performance may also be displayed on future trials using novel (unseen) test data. PCA is a popular technique used to derive a starting set of features for both face representation and recognition. Kirby and Sirovich showed that any particular face may be (i) economically represented along the eigenpictures coordinate space, and (ii) approximately reconstructed using just a small collection of eigenpictures and their corresponding projections ('coefficients') [M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, 1990].
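The representation and reconstruction properties noted by Kirby and Sirovich can be illustrated with a minimal sketch. It assumes NumPy and uses synthetic vectors in place of face images; the function names (pca_basis, project, reconstruct) and the data sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def pca_basis(X, k):
    """Return the mean and the top-k principal directions of the rows of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are orthonormal "eigenpictures".
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(x, mean, basis):
    """Coefficients of x along the eigenpictures."""
    return basis @ (x - mean)

def reconstruct(coeffs, mean, basis):
    """Approximate the original vector from its coefficients."""
    return mean + basis.T @ coeffs

rng = np.random.default_rng(0)
# Synthetic stand-in for face images (an assumption): 40 samples of 64
# "pixels" that lie close to a 5-dimensional subspace.
latent = rng.normal(size=(40, 5))
mixing = rng.normal(size=(5, 64))
X = latent @ mixing + 0.01 * rng.normal(size=(40, 64))

mean, basis = pca_basis(X, k=5)
coeffs = project(X[0], mean, basis)        # 5 numbers instead of 64
approx = reconstruct(coeffs, mean, basis)
rel_err = np.linalg.norm(X[0] - approx) / np.linalg.norm(X[0])
```

Here each 64-element vector is summarized by five coefficients, and the relative reconstruction error stays small because the synthetic data were built to lie near a low-dimensional subspace.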
Applying the PCA technique to face recognition, Turk and Pentland developed the well-known Eigenfaces method, which sparked an explosion of interest in applying statistical techniques to face recognition [M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 13, no. 1, pp. 71-86, 1991]. PCA, an optimal representation criterion (in the sense of mean square error), does not consider the classification aspect. One solution for taking into account and improving the classification performance is to combine PCA, the optimal representation criterion, with the Bayes classifier, the optimal classification criterion (when the density functions are given). Toward that end, Moghaddam and Pentland developed a probabilistic visual learning method, which uses the eigenspace decomposition as an integral part of estimating complete density functions in high-dimensional image space [B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696-710, 1997]. While the leading eigenvalues are derived directly by PCA, the remainder of the eigenvalue spectrum is estimated by curve fitting.
Rather than estimating the densities in high-dimensional space, Liu and Wechsler developed a PRM (Probabilistic Reasoning Models) method by first applying PCA for dimensionality reduction and then applying the Bayes classifier and the MAP rule for classification [C. Liu and H. Wechsler, "Robust coding schemes for indexing and retrieval from large face databases," IEEE Trans. on Image Processing, vol. 9, no. 1, pp. 132-137, 2000]. The rationale of the PRM method is that of lowering the space dimension subject to increased fitness for the discrimination index by estimating the conditional density function of each class using the within-class scatter in the reduced PCA space.
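The idea of a MAP decision in a reduced space can be sketched as follows. This is not the PRM method itself: the sketch assumes that the inputs are already PCA-reduced, that each class density is Gaussian, that all classes share one diagonal covariance estimated from the pooled within-class scatter, and that the priors are equal. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_map_classifier(Z, y):
    """Class means plus one pooled diagonal variance (assumed shared by all classes)."""
    classes = np.unique(y)
    means = np.stack([Z[y == c].mean(axis=0) for c in classes])
    # Deviation of every sample from its own class mean (within-class scatter).
    residuals = Z - means[np.searchsorted(classes, y)]
    var = (residuals ** 2).sum(axis=0) / (len(Z) - len(classes))
    return classes, means, var

def map_predict(z, classes, means, var):
    """MAP decision under the stated assumptions.

    With equal priors and a shared covariance, maximizing the posterior is the
    same as choosing the class mean at minimum Mahalanobis distance.
    """
    d2 = (((z - means) ** 2) / var).sum(axis=1)
    return classes[np.argmin(d2)]

rng = np.random.default_rng(0)
# Synthetic reduced-space data (an assumption): 3 classes, 15 samples each.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
Z = np.vstack([c + rng.normal(size=(15, 2)) for c in centers])
y = np.repeat([0, 1, 2], 15)

classes, means, var = fit_map_classifier(Z, y)
label = map_predict(np.array([9.5, 0.5]), classes, means, var)
```

Estimating only a mean per class and one pooled variance per dimension keeps the number of parameters small, which matters when only a few training examples per class are available.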
Another important statistical technique widely used in face recognition is the FLD, which models both the within- and the between-class scatters. FLD, which is behind several face recognition methods, induces non-orthogonal projection bases, a characteristic known to have great functional significance in biological sensory systems [J. G. Daugman, "An information-theoretic view of analog representation in striate cortex," in Computational Neuroscience, E. L. Schwartz, Ed., pp. 403-424. MIT Press, 1990]. As the original image space is highly dimensional, most face recognition methods first perform dimensionality reduction using PCA, as is the case with the Fisherfaces method suggested by Belhumeur, Hespanha, and Kriegman [P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, 1997]. Swets and Weng have pointed out that the Eigenfaces method derives only the Most Expressive Features (MEF) and that PCA-inspired features do not necessarily provide for good discrimination. As a consequence, the subsequent FLD projections are used to build the Most Discriminating Features (MDF) classification space [D. L. Swets and J. Weng, "Using discriminant eigenfeatures for image retrieval," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 831-836, 1996]. The MDF space is, however, superior to the MEF space for face recognition only when the training images are representative of the range of face (class) variations; otherwise, the performance difference between the MEF and MDF is not significant. The drawback of FLD is that it requires large sample sizes for good generalization. For a face recognition problem, however, there are usually a large number of faces (classes), but only a few training examples per face.
One possible remedy for this drawback is to artificially generate additional data and thus increase the sample size [K. Etemad and R. Chellappa, "Discriminant analysis for recognition of human face images," J. Opt. Soc. Am. A, vol. 14, pp. 1724-1733, 1997]. Yet another remedy is to improve FLD's generalization performance by balancing the need for adequate signal representation and subsequent classification performance using sensitivity analysis on the spectral range of the within-class eigenvalues [C. Liu and H. Wechsler, "Robust coding schemes for indexing and retrieval from large face databases," IEEE Trans. on Image Processing, vol. 9, no. 1, pp. 132-137, 2000].
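The within- and between-class scatters that FLD models, and the non-orthogonal projection basis it induces, can be sketched as follows. The sketch assumes NumPy, synthetic data, and a nonsingular within-class scatter matrix; it solves the FLD criterion by whitening the within-class scatter and then diagonalizing the between-class scatter, which is one standard route and is not presented as the procedure of any cited reference. All function names are hypothetical.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter of the rows of X."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    return Sw, Sb

def fld_directions(X, y, n_components):
    """Projection basis W maximizing between-class over within-class scatter."""
    Sw, Sb = scatter_matrices(X, y)
    # Step 1: whiten Sw. This needs Sw to be nonsingular, which fails when
    # there are fewer samples than dimensions (hence PCA is usually run first).
    w_vals, w_vecs = np.linalg.eigh(Sw)
    whiten = w_vecs / np.sqrt(w_vals)      # scales column j by w_vals[j] ** -0.5
    # Step 2: diagonalize Sb in the whitened space, largest eigenvalues first.
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ Sb @ whiten)
    order = np.argsort(b_vals)[::-1][:n_components]
    return whiten @ b_vecs[:, order]

rng = np.random.default_rng(0)
# Synthetic data (an assumption): 3 classes, 20 samples each, 4 dimensions.
centers = np.array([[0.0, 0, 0, 0], [4.0, 0, 0, 0], [0.0, 4, 0, 0]])
X = np.vstack([c + rng.normal(size=(20, 4)) for c in centers])
y = np.repeat([0, 1, 2], 20)

W = fld_directions(X, y, n_components=2)   # at most (classes - 1) useful directions
Sw, Sb = scatter_matrices(X, y)
```

The columns of W satisfy W^T Sw W = I rather than W^T W = I, so the basis is in general non-orthogonal. The whitening step divides by the within-class eigenvalues, so small, poorly estimated eigenvalues are amplified, which is one way to see why FLD can generalize poorly with few samples per class.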
Other developments, which are conceptually relevant to the face recognition community in general, and to this paper in particular, include the Local Feature Analysis (hereinafter referred to as LFA) method due to Penev and Atick [P. S. Penev and J. J. Atick, "Local feature analysis: A general statistical theory for object representation," Network: Computation in Neural Systems, vol. 7, pp. 477-500, 1996], the related Dynamic Link Architecture (hereinafter referred to as DLA) [M. Lades, J. C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R. P. Wurtz, and W. Konen, "Distortion invariant object recognition in the dynamic link architecture," IEEE Trans. Computers, vol. 42, pp. 300-311, 1993], and elastic graph matching methods [L. Wiskott, J. M. Fellous, N. Kruger, and C. von der Malsburg, "Face recognition by elastic bunch graph matching," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 775-779, 1997]. LFA uses a sparse version of the PCA transform, followed by a discriminative network. The DLA starts by computing the Gabor jets, and then performs a flexible template comparison between the resulting image decompositions using graph matching.
A method of face recognition that uses Gabor filters and Fisher Linear Discriminants is disclosed in U.S. Pat. No. 6,219,640 to Basu et al., entitled "Methods and Apparatus for Audio-Visual Speaker Recognition and Utterance Verification." The '640 patent uses Gabor filters to find features, such as large-scale features (eyes, nose, and mouth) and 29 sub-features (hairline, chin, ears, . . . ). Gabor Jet representations are generated at estimated sub-feature locations. The representations use complex coefficients encompassing thousands of complex numbers. Image processing as described in the '640 patent, in which features (such as eyes, nose, hairline, chin . . . ) are detected, is time-consuming because of its heavy computational cost and is not robust because the detection results are unreliable. What is needed is a full facial Gabor wavelet transformation where no feature detection is involved.
The '640 patent also discloses the application of Fisher Linear Discriminants to detect facial features. The FLD is applied as a binary classifier for feature detection rather than for face recognition. It is well known that FLD does not generalize gracefully (see ref. [10]: Donato et al., "Classifying Facial Actions," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 10, 1999). A system with an improved FLD generalization ability would overcome the generalization limitations of the '640 patent.
What is needed is a face recognition system wherein an image may be matched against sample images efficiently and economically. This system will preferably generalize well and not require any discrete feature matching.
One advantage of the present invention is that it derives statistical features from a whole image, hence there is no need for facial feature point detection.
Another advantage of this invention is that it derives an augmented Gabor feature vector, whose dimensionality may be further reduced using the EFM by considering both data compression and recognition (generalization) performance.
Another advantage of this invention is that its derived Gabor wavelet utilizes a new normalization procedure and an enhanced LDA that is optimal for both representation and recognition purposes.
A further advantage of this invention is that it integrates Gabor and EFM, which considers not only data compression and recognition accuracy, but also generalization performance.
To achieve the foregoing and other advantages, in accordance with the invention as embodied and broadly described herein, a method for determining similarity between an image and at least one training sample is provided, comprising the steps of: iteratively for the image and each of the training sample(s): generating a preprocessed image; calculating an augmented Gabor feature vector from the preprocessed image; deriving a lower dimensional feature vector from the augmented Gabor feature vector; processing the lower dimensional feature vector with a lower dimensional feature space discriminator; deriving an overall transformation matrix from the lower dimensional feature vector and the lower dimensional feature space discriminator; and calculating an image feature vector; and determining a similarity measure for the image and the at least one training sample.
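The flow of the steps recited above can be illustrated with a simplified sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the Gabor kernel form and its parameters (sigma, k_max, f), the use of only 2 scales and 4 orientations, FFT-based circular convolution, downsampling by 4, plain PCA as the dimensionality reduction, and cosine similarity as the similarity measure. The discriminator step is omitted, so the overall transformation here is the PCA projection alone. All function names are hypothetical, and random arrays stand in for face images.

```python
import numpy as np

def preprocess(image):
    """Assumed preprocessing: zero mean and unit variance."""
    return (image - image.mean()) / (image.std() + 1e-12)

def gabor_kernel(scale, orientation, size=17, sigma=2 * np.pi,
                 k_max=np.pi / 2, f=np.sqrt(2), n_orient=8):
    """Complex Gabor kernel; all parameter defaults are assumptions."""
    k = k_max / f ** scale
    phi = np.pi * orientation / n_orient
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = (k ** 2 / sigma ** 2) * np.exp(-(k ** 2) * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    # Subtracting exp(-sigma^2 / 2) makes the kernel insensitive to mean brightness.
    carrier = np.exp(1j * k * (np.cos(phi) * x + np.sin(phi) * y)) - np.exp(-sigma ** 2 / 2)
    return envelope * carrier

def augmented_gabor_vector(image, scales=(0, 1), orientations=(0, 2, 4, 6), step=4):
    """Concatenate downsampled, normalized Gabor magnitude responses."""
    spectrum = np.fft.fft2(image)
    parts = []
    for v in scales:
        for u in orientations:
            kern_spectrum = np.fft.fft2(gabor_kernel(v, u), s=image.shape)
            response = np.abs(np.fft.ifft2(spectrum * kern_spectrum))  # circular convolution
            sampled = response[::step, ::step].ravel()
            parts.append((sampled - sampled.mean()) / (sampled.std() + 1e-12))
    return np.concatenate(parts)

def fit_pca(features, k):
    """Mean and top-k principal directions of the training feature vectors."""
    mean = features.mean(axis=0)
    _, _, Vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, Vt[:k]

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
train_images = rng.normal(size=(6, 32, 32))            # stand-ins for training samples
probe = train_images[2] + 0.01 * rng.normal(size=(32, 32))

train_feats = np.stack([augmented_gabor_vector(preprocess(im)) for im in train_images])
mean, basis = fit_pca(train_feats, k=5)
train_proj = (train_feats - mean) @ basis.T
probe_proj = (augmented_gabor_vector(preprocess(probe)) - mean) @ basis.T

scores = [cosine_similarity(probe_proj, t) for t in train_proj]
best = int(np.argmax(scores))                          # index of the most similar sample
```

Because the features are computed over the whole image, no facial feature points are located at any stage. A discriminant transformation such as the one described for FLD could be applied to train_proj and probe_proj before the similarity is computed.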
In yet a further aspect of the invention, a system is provided for determining similarity between an image and at least one training sample. The image may be preprocessed by an image preprocessor capable of generating a preprocessed image from the image. An augmented Gabor feature vector calculator may then generate an augmented Gabor feature vector from the preprocessed image. Next, a lower dimensional feature vector deriver may derive a lower dimensional feature vector from the augmented Gabor feature vector. A lower dimensional feature space processor may then create a discriminated lower dimensional feature space vector by processing the lower dimensional feature vector with a lower dimensional feature space discriminator. An overall transformation matrix deriver may then derive an overall transformation matrix from the discriminated lower dimensional feature vector. An image feature vector calculator may then calculate an image feature vector using the overall transformation matrix, and a similarity measure calculator then preferably determines a similarity measure for the image and the at least one training sample.
Additional objects, advantages and novel features of the invention will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.