1. Field of the Invention
This invention relates generally to object recognition by computers, and more specifically, to facial recognition techniques applied to a sequence of images.
2. Background Art
Computer vision through object recognition is a giant step in computer intelligence that provides a myriad of new capabilities. Facial recognition has particularly valuable applications in verifying a person's identity, robot interaction with humans, security surveillance, etc. With a reliable facial recognition system, computers can provide security clearance for authorized individuals, and robots can perform a set of actions designed for a specific individual. However, when currently available facial recognition systems perform identifications, they are limited to basing such identification on a single image generated under ideal circumstances. Examples of currently available facial recognition systems include R. Chellappa et al., “Human and Machine Recognition of Faces: A Survey,” Proceedings of the IEEE (1995); A. Samal et al., “Automatic Recognition and Analysis of Human Faces and Facial Expressions: A Survey,” Pattern Recognition (1992); and W. Y. Zhao et al., “Face Recognition: A Literature Survey,” Technical Report CAR-TR-948, Center for Automation Research, University of Maryland (2000).
One problem with relying on ideal circumstances, such as assuming that an individual to be recognized is positioned in an ideal pose, is that circumstances are rarely ideal. In an ideal pose, the camera has a full frontal view of the face without any head tilt. Any two-dimensional or three-dimensional rotation may either cause a false identification or prevent the camera from collecting a sufficient number of data points for comparison. Even when the individual attempts to position himself for the ideal image, misjudgment in orientation may still be problematic.
Obstacles between the individual's face and the camera create additional problems for conventional recognition systems. Since those systems are incapable of distinguishing an obstacle from the individual's face in a resulting image, the obstacle distorts any subsequent comparison. As with facial rotations, occlusions can also prevent the camera from collecting sufficient data.
A problem related to non-ideal circumstances is that typical recognition systems use a single image, so if the single image is distorted, the identification will be affected. False identifications can consequently result in security breaches and the like. Even systems that incorporate more than one image in recognition, such as temporal voting techniques, are susceptible to false identifications. Temporal voting techniques make an identification for a first image, make an independent identification for a second image, and so on, and base recognition on the most frequent independent identification. Examples of temporal voting techniques include A. J. Howell and H. Buxton, “Towards Unconstrained Face Recognition From Image Sequences,” Proc. IEEE Int'l Conf. On Automatic Face and Gesture Recognition (1996); G. Shakhnarovich et al., “Face Recognition From Long-Term Observations,” Proc. European Conf. On Computer Vision (2002); and H. Wechsler et al., “Automatic Video-Based Person Authentication Using the RBF Network,” Proc. Int'l Conf. On Audio and Video-Based Person Authentication (1997), which are incorporated by reference herein in their entirety. However, each identification is made independently of the other images. Thus, sustained pose variations and/or occlusions will still distort the outcome.
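By way of illustration only, the temporal voting described above may be sketched as a majority vote over per-frame identifications. The sketch below is not taken from any of the cited references; the function name temporal_vote, the identity labels, and the assumption that some per-frame classifier has already produced one label per image are all hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def temporal_vote(frame_ids: Sequence[Hashable]) -> Hashable:
    """Return the identity chosen most often across the frames.

    Each element of frame_ids is the identity assigned to one image by a
    per-frame classifier (hypothetical, not specified in the text). The
    images are treated independently, so their order does not change
    which identity has the highest count.
    """
    if not frame_ids:
        raise ValueError("need at least one per-frame identification")
    return Counter(frame_ids).most_common(1)[0][0]


# Brief distortion: a single misidentified image is outvoted.
brief = ["alice", "alice", "bob", "alice", "alice"]

# Sustained occlusion: most images are misidentified, so the vote is wrong
# even though the same individual appears in every image.
sustained = ["alice", "bob", "bob", "bob", "alice"]

print(temporal_vote(brief))      # alice
print(temporal_vote(sustained))  # bob
```

Because the vote only counts labels, it cannot use the fact that consecutive images show the same face in slowly changing poses, which is why a sustained occlusion can dominate the result.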
Therefore, what is needed is a robust facial recognition system that exploits temporal coherency between successive images to make recognition decisions. Such a system should accurately identify target individuals in non-ideal circumstances such as pose variations or occlusions.