1. Field of the Invention
This invention relates generally to image processing systems and more particularly to a system and method for automatic extraction of demographic information (age, gender, or ethnicity) from an image.
2. Background of the Invention
Human faces provide us with a plethora of information that is valuable and necessary for social interaction. When we encounter a face, we can quickly and successfully decide whether it is one we know. For faces of people we know, we can easily retrieve semantic and identity information about the person. Furthermore, from both familiar and unfamiliar faces we can estimate a person's gender, ethnicity, and age.
Automated collection of demographic information has numerous applications. It has the potential not only to enhance existing human-computer interaction (HCI) systems but also to serve as a platform for passive surveillance (e.g., alerting medical authorities if there is an accident in an old age home). It can also be used for development of new HCI applications (e.g., helping prospective buyers choose a product, or cigarette vending machines based on age verification), immersive computer games (e.g., changing scenarios and multimedia content based on demographic preferences), collecting retail business information (e.g., the number of women entering a retail store on a given day), image retrieval (e.g., accessing all images of babies), enhancing identity verification (e.g., an ATM where the demographic information of the user can be verified in real time against an existing database to provide enhanced security), and advertising (e.g., focusing on a particular demographic group for selling a product).
U.S. Pat. No. 5,781,650 to De Lobo describes an automatic feature detection and age classification method for human faces in images. Their automatic age categorization system is based on finding a face in an image and locating the facial features. Using these facial features, the distances between them, and wrinkle analysis of the skin, they categorize the age of the human face in the image. In the paper titled “Age Classification for Facial Images”, Young H. Kwon and Niels De Vitoria Lobo, Computer Vision and Image Understanding, 74(1), pp. 1-21, 1991, they used cranio-facial development theory and wrinkle analysis for age classification. Their invention did not use components for classifying age and did not have a mechanism for fusion of classifier results. Furthermore, their system cannot be applied in its current form to ethnicity and gender classification.
U.S. Pat No. (Application) 60/421,717 to Sharma describes another method for automatic age category classification based on Support Vector Machines (SVM), in which full-face images are used for classification. Their system is not based on facial components for classification purposes.
U.S. Pat No. (Application) 60/436,933 to Sharma et al. describes a method for classifying human face images according to ethnicity using SVM. Their system is based on full-face images and does not use facial components for ethnicity classification.
US Pat No. (Application) US 20030110038A1 granted to Sharma et al. describes a method for multi-modal gender classification. Their method performs gender classification from acoustic signals and face images of the user using statistical classification algorithms. Their method used full-face images and did not use components for gender classification.
U.S. Pat. No. 6,421,463 granted to Poggio et al. describes a method for detecting humans in an image using components. Their method is based on detecting the different human body components in an image using wavelets and classifying these components. The outputs of the component classifiers are fused together to give the final output. In the research paper titled “Example-Based Object Detection in Images by Components”, Anuj Mohan, Constantine Papageorgiou, and Tomaso Poggio, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(4), pp. 349-364, 2001, the authors identified four components, namely head, legs, left arm, and right arm, on the basis of wavelet transforms to perform pedestrian detection. They did not apply their system or method to demographic classification. Moreover, their invention is based on wavelet transforms for classification. Furthermore, Poggio's patent either does not use or does not clarify the classifier fusion mechanism.
US Pat No. US2001/0,036,298 granted to Yamada et al. describes a classification methodology for detection, recognition, and identification of age and gender using the left eye, the right eye, and the region between the eyes. Their system is restricted to the eyes and does not include any other component of the human body or facial feature for classification.
The patent by Perona et al., Pat No. (Application) US20030026483A1, describes a method for object detection using features. They used expectation maximization to assess a joint probability of which features are most relevant. Their invention defines a statistical model in which shape variability is modeled in a probabilistic setting. In the research paper titled “Finding Faces in Cluttered Scenes using Random Labeled Graph Matching”, T. K. Leung, M. C. Burl, and P. Perona, Fifth International Conference on Computer Vision, 1995, the authors identified five features, namely left eye, right eye, left nostril, right nostril, and mouth, using a random labeled graph matching algorithm, and identified faces using a joint probabilistic model of faces. Their system is not suited to demographic classification because the probabilistic models for any two demographic classes are very similar to each other and hence indistinguishable.
The patent granted to Viola, US Pat No. (Application) US20020102024A1, describes a method for object detection using an integral image representation of the input image. The object detector uses a cascade of homogeneous classification functions or classifiers. Their invention defines a fast method for object detection using rectangular components defined by wavelets. In the research paper titled “A Unified Learning Framework for Real Time Face Detection & Classification”, Gregory Shakhnarovich, Paul Viola, and Baback Moghaddam, International Conference on Automatic Face and Gesture Recognition, 2002, the authors performed demographic classification using the integral image. Their system computes the integral image rather than classifying each component, and the result is integrated over time. Furthermore, their system is based on wavelets to identify components.
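For readers unfamiliar with the integral image representation mentioned above, the following is a minimal illustrative sketch of the general, well-known technique (a summed-area table). It is not taken from the cited patent application or paper; the function names `integral_image` and `rect_sum` and the list-of-lists image layout are assumptions made for illustration only.

```python
def integral_image(img):
    """Return the summed-area table of a 2-D list of numbers.

    The table carries one extra leading row and column of zeros, so
    ii[r][c] holds the sum of all pixels in rows 0..r-1, columns 0..c-1.
    (Hypothetical helper, for illustration only.)
    """
    rows = len(img)
    cols = len(img[0]) if rows else 0
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row
        for c in range(cols):
            row_sum += img[r][c]
            # Sum above (previous table row) plus the current row so far.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1.

    Uses four table lookups, regardless of the rectangle size.
    """
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    table = integral_image(image)
    print(rect_sum(table, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

Once the table is built, the sum over any rectangular region costs four lookups, which is what makes rectangular features inexpensive to evaluate at many positions and scales.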
Moghaddam et al., in “Gender Classification with Support Vector Machines”, IEEE International Conference on Automatic Face and Gesture Recognition, pp. 306-311, 2000, performed gender classification from full-face images using Support Vector Machines. Their system did not use components for classification. Moreover, they did not show that their system could be applied to ethnicity and age classification. Gutta et al., in “Mixture of Experts for Classification of Gender, Ethnic Origin, and Pose of Human Faces”, IEEE Transactions on Neural Networks, 11(4), pp. 948-960, 2000, performed gender and ethnicity classification using Radial Basis Functions and Inductive Trees. Their system did not use components for classification purposes.
Wiskott et al., in “Face Recognition and Gender Determination”, pp. 92-97, 1995, used Elastic Graph Matching on full-face images to perform gender classification. They did not use components for classification purposes.
Bebis et al., in “Neural-Network-Based Gender Classification Using Genetic Search for Eigen-Feature Selection”, IEEE World Congress on Computational Intelligence, 2002, used Neural Networks, Genetic Algorithms, and PCA to perform gender classification. They did not use components for gender classification.
The patent granted to Player, US Pat No. (Application) US20020052881A1, shows an example of the use of demographic information for customizing computer games and advertising. It does not show any method or system for extracting demographic information from images or videos.