Biometric identification may be performed via an automated system capable of capturing a biometric sample or evidence from a user, extracting biometric data from the sample, comparing the biometric data with that contained in one or more reference templates, deciding how well they match, and indicating whether or not an authentication of identity or identification has been achieved.
Biometric identification based on face recognition is particularly useful for security applications and human-machine interfaces. Support vector machines (SVMs) are a class of learning algorithms for classification and regression that are particularly useful for high-dimensional input data with either large or small training sets. SVMs suitable for identification problems work by mapping the input features into a high-dimensional feature space and computing linear functions on the mapped features in that space.
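By way of illustration only, the mapping into a feature space may be sketched as follows, assuming a homogeneous degree-2 polynomial kernel on two-dimensional inputs (an illustrative choice that is not taken from the cited works). The sketch shows that a linear function (here, a dot product) computed on the explicitly mapped features gives the same value as a kernel evaluation on the original inputs. The names phi, dot and poly_kernel and the sample values are hypothetical.

```python
import math

def phi(x):
    """Explicit feature map for k(x, z) = (x . z)^2 on 2-D inputs (illustrative)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, z):
    """Kernel value computed directly in the input space."""
    return dot(x, z) ** 2

# Hypothetical input feature vectors.
x = (1.0, 2.0)
z = (3.0, -1.0)

mapped = dot(phi(x), phi(z))   # linear function in the mapped feature space
direct = poly_kernel(x, z)     # same quantity, without the explicit mapping
```

In practice the mapped space may have a very high dimension, which is why the kernel form is usually evaluated instead of the explicit mapping.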
SVMs are generally trained through supervised learning, in which the best function that relates the output data to the input data is computed, and the goodness of this function is judged by its ability to generalize on new inputs, i.e., inputs which are not present in the training set. For a detailed description of learning methods for SVMs, reference may be made to N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and other kernel-based learning methods, pp. 93-122, Cambridge University Press, 2000.
Currently, several methods are known that propose the use of SVMs, alone or in combination with other recognition techniques, for face recognition and/or detection.
For example, B. Heisele, P. Ho, J. Wu, T. Poggio, Face Recognition: component-based versus global approaches, in Computer Vision and Image Understanding 91, pp. 6-21, Elsevier, 2003 proposes three SVM-based face recognition methods. The first one follows a so-called component-based approach, according to which the facial components are located, extracted, and combined in a single feature vector that is classified by an SVM. Briefly, the SVM-based recognition system decomposes the face into a set of components that are interconnected by a flexible geometrical model. The other two SVM-based face recognition methods are implementations of global systems, which recognize faces by classifying single feature vectors consisting of the gray values of the whole face image. In particular, in the first global system an SVM is created for each person in the database, whereas the second global system uses sets of view-specific SVMs that are clustered during training.
Another SVM-based face recognition system is proposed in L. Zhuang, H. Ai, G. Xu, Training Support Vector Machines for video based face recognition, Tsinghua University, Beijing, 2001, where two different strategies for the m-class video-based face recognition problem with SVMs are discussed, one for global face feature sets and one for Principal Component Analysis (PCA) compressed feature sets. In the first case, normalized raw samples are considered as feature vectors of 2112 gray values for SVM training, while in the second case, the coefficients of the PCA projection are used as feature vectors for training.
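Purely as an illustrative sketch of the second strategy, and not as the procedure of the cited work, the following shows how PCA projection coefficients may be obtained from raw feature vectors. It assumes tiny two-dimensional samples instead of 2112 gray values, keeps only the leading principal component, and finds it by power iteration, which in turn assumes a dominant eigenvalue and a starting vector not orthogonal to the leading eigenvector. All function names and sample values are hypothetical.

```python
import math

def center(rows):
    """Subtract the per-dimension mean from every sample."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, mu)] for row in rows]

def covariance(centered):
    """Covariance matrix of already-centered samples."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=50):
    """Leading eigenvector by power iteration (assumes a dominant eigenvalue)."""
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical raw samples lying along one direction.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
centered = center(samples)
axis = top_component(covariance(centered))

# One PCA coefficient per sample; these would replace the raw values as SVM inputs.
coefficients = [sum(c * a for c, a in zip(row, axis)) for row in centered]
```

A practical system would retain several components, so that each face is represented by a short vector of coefficients rather than by a single number.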
A further analysis of the use of SVMs in the context of face recognition is disclosed in K. Jonsson, J. Kittler, Y. P. Li, J. Matas, Support Vector Machines for Face Authentication, The 10th British Machine Vision Conference 1999, pp. 543-553. This paper supports the hypothesis that the SVM approach is able to extract the relevant discriminatory information from the training data, even when no complex transformations are performed on the original raw face images. Analyzing the results of the experiments in which faces were represented in both Principal Component and Linear Discriminant spaces, the authors come to the conclusion that SVMs have an inherent potential to capture the discriminatory features from the training data irrespective of representation and preprocessing.
Furthermore, US 2003/0103652 discloses a system and a method for performing face registration and authentication using face information. A set of readily distinguishable features for each user is selected at a registration step, and only the set of features selected at the registration step is used at a face authentication step, whereby the memory use attributable to unnecessary information and the amount of data calculation for face authentication can be reduced. Therefore, identity authentication through face authentication can be performed even under the restricted environment of a USB token or smart card with limited resources. Authentication performance is improved, as readily distinguishable feature information is used, and the time for face authentication is reduced, as face authentication is performed using the SVM built at a training step from the optimal set of readily distinguishable features.
Additionally, in S. M. Bileschi, B. Heisele, Advances in component-based face detection, Pattern Recognition with Support Vector Machines, First International Workshop, SVM 2002, Proceedings (Lecture Notes in Computer Science Vol. 2388), pp. 135-43, a component-based face detection system trained only on positive examples is described. On the first level, SVM classifiers detect predetermined rectangular portions of faces in gray scale images. On the second level, histogram-based classifiers judge the pattern using only the positions of maximization of the first level classifiers. In this approach, selected parts of the positive pattern are used as negative training examples for the component classifiers, and pair-wise correlation between facial component positions is used to bias the classifier outputs and achieve improved component localization.
The Applicant has noted that in the field of biometric authentication based on facial recognition with m-class SVMs (that perform classification of data into more than two classes) a problem exists, namely, for each authorized user a very large number of samples of the user's face is required for the training of the SVMs so as to achieve a good level of recognition, i.e., a low error rate. This can lead to an enrollment process (i.e., a process of collecting biometric samples from a user and subsequently computing and storing a biometric reference template representing the user's identity) that, for each authorized user, takes a large amount of time and computational resources.
Generally, two approaches can be used for training m-class SVMs: the one-versus-all approach and the pair-wise approach.
Specifically, in the one-versus-all approach, m SVMs are trained, each SVM separating a single class from all the remaining classes. As such, an SVM exists for each user in the authorized clients' database that recognizes/discriminates the user from any other user in the database.
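The one-versus-all decomposition may be sketched as follows, purely by way of example. The sketch only builds the m binary training sets and does not train any SVM. The final decision rule, which picks the user whose SVM gives the largest decision value, is a common convention assumed here rather than one stated above. The database contents, the decision values and all names are hypothetical.

```python
def one_versus_all_sets(samples_by_user):
    """Build one binary training set per user: own samples +1, all others -1."""
    training_sets = {}
    for user in samples_by_user:
        labeled = []
        for other, samples in samples_by_user.items():
            label = 1 if other == user else -1
            labeled.extend((sample, label) for sample in samples)
        training_sets[user] = labeled
    return training_sets

def identify(decision_values):
    """Assumed rule: choose the user whose SVM returns the largest value."""
    return max(decision_values, key=decision_values.get)

# Hypothetical database: 3 users, 2 feature vectors each.
database = {
    "alice": [(0.1, 0.9), (0.2, 0.8)],
    "bob": [(0.9, 0.1), (0.8, 0.2)],
    "carol": [(0.5, 0.5), (0.6, 0.4)],
}
sets = one_versus_all_sets(database)
positives_for_alice = sum(1 for _, label in sets["alice"] if label == 1)
negatives_for_alice = sum(1 for _, label in sets["alice"] if label == -1)

# Hypothetical outputs of the three trained SVMs for one probe image.
winner = identify({"alice": -0.7, "bob": 1.3, "carol": -0.2})
```

Each of the m training sets contains every sample in the database, so enrolling one more user changes the negative set of every existing SVM.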
In the pair-wise approach, m(m−1)/2 SVMs are trained, each separating a pair of classes. The SVMs are disposed in trees, where each tree node represents an SVM. In G. Guodong, S. Li, C. Kapluk, Face recognition by support vector machines, in Proc. IEEE International Conference on Automatic Face and Gesture Recognition, 2000, pp. 196, a bottom-up tree similar to the elimination tree used in tennis tournaments was applied to face recognition.
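As a minimal sketch of the pair-wise scheme, the following enumerates the m(m−1)/2 class pairs and runs a bottom-up elimination tournament over the classes. The function duel stands in for the trained pair-wise SVM of each tree node and is replaced here by a lookup in a table of made-up scores, so that the sketch is not the method of the cited paper. Giving a bye to an unpaired class is likewise an assumption. All names and values are hypothetical.

```python
from itertools import combinations

def pairwise_problems(classes):
    """All unordered class pairs, i.e. one binary SVM per pair."""
    return list(combinations(classes, 2))

def tournament(classes, duel):
    """Bottom-up elimination: the winner of each duel advances to the next round."""
    remaining = list(classes)
    duels_played = 0
    while len(remaining) > 1:
        next_round = []
        for i in range(0, len(remaining) - 1, 2):
            next_round.append(duel(remaining[i], remaining[i + 1]))
            duels_played += 1
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])  # unpaired class gets a bye (assumption)
        remaining = next_round
    return remaining[0], duels_played

users = ["u1", "u2", "u3", "u4", "u5"]
pairs = pairwise_problems(users)

# Stand-in for the pair-wise SVMs: made-up similarity scores for one probe image.
scores = {"u1": 0.2, "u2": 0.9, "u3": 0.4, "u4": 0.1, "u5": 0.6}

def duel(a, b):
    return a if scores[a] >= scores[b] else b

winner, duels_played = tournament(users, duel)
```

Although m(m−1)/2 SVMs must be trained, only m−1 of them are evaluated for a given probe, since each duel eliminates exactly one class.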
Both solutions are supervised learning procedures that need both positive and negative training examples, i.e., samples of the face of the user to be recognized and samples of the faces of people other than the user to be recognized, respectively. The limitation of these solutions is that reliable recognition (i.e., a low error rate) requires an enormous number of negative examples. In the best case in terms of computational speed, in the one-versus-all approach, the number of negative examples has to be at least equal to the number of entries in the database minus one, all multiplied by a constant (for example, the number of possible head poses). Likewise, the pair-wise approach may become computationally very slow as the users' database grows. Of course, the algorithms' performance depends on the available computational power, but generally these approaches may not scale well, with an enrollment process that may take several days (reference may, for example, be made to B. Heisele, T. Poggio, M. Pontil, Face Detection in still gray images, A. I. Memo 1687, Center for Biological and Computational Learning, MIT, Cambridge, 2000).
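The growth described above may be made concrete with a short arithmetic sketch. It assumes that the constant is the number of head poses per user and that every one of the m one-versus-all SVMs uses exactly the minimum number of negative examples. The database sizes, the pose count and the function names are hypothetical and are not figures from the cited works.

```python
def min_negatives_per_svm(num_users, poses_per_user):
    """Lower bound from the text: (database entries - 1) times a constant."""
    return (num_users - 1) * poses_per_user

def total_negatives_one_versus_all(num_users, poses_per_user):
    """Negatives summed over all m one-versus-all SVMs (assumed minimum each)."""
    return num_users * min_negatives_per_svm(num_users, poses_per_user)

def pairwise_svm_count(num_users):
    """Number of SVMs to train in the pair-wise approach: m(m-1)/2."""
    return num_users * (num_users - 1) // 2

# Hypothetical database sizes, with 5 head poses per user.
small = (min_negatives_per_svm(100, 5),
         total_negatives_one_versus_all(100, 5),
         pairwise_svm_count(100))
large = (min_negatives_per_svm(1000, 5),
         total_negatives_one_versus_all(1000, 5),
         pairwise_svm_count(1000))
```

Under these assumptions a tenfold increase in the number of users raises both the total number of negative examples and the number of pair-wise SVMs by roughly a factor of one hundred.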