1. Field of the Invention
Embodiments of the present invention relate to face recognition.
2. Background Art
Face recognition uses computers to recognize a person from a digital image or a video frame. Face recognition can be used for a variety of purposes, including identification, security, law enforcement, and digital photography and video. A number of methods have been developed for face recognition. For instance, a typical automatic face recognition (AFR) system is composed of three parts or levels: face detection, face alignment, and face recognition. Given images containing faces, face detection locates a face, face alignment locates key feature points of the face, and face recognition determines whose face it is. Many algorithms have been proposed for human face recognition. However, these algorithms have each focused on only a single part of a face recognition system. Conventionally, the three parts are processed in sequence: face detection is performed first, the detection results are passed to face alignment, and the alignment results are then passed to face recognition. This is a bottom-up approach, as shown in FIG. 1. Another approach is the top-down approach, as shown in FIG. 2, which operates in the reverse order of the bottom-up approach.
In a typical bottom-up approach, each part or level provides data to the next level; it is a data-driven approach. This approach may use only class-independent information, that is, information that is not specific to a class of persons. A class may be one or more specific persons to be recognized or identified. Typical bottom-up approaches therefore may not rely on class-specific knowledge: in such AFR systems, face detection and face alignment may not use knowledge about the classes of persons to be recognized.
Also, for a bottom-up approach to be practical, domain-independent processing must be inexpensive, and the input data for each part or level must be accurate and yield reliable results for the next level. As face detection and face alignment have become less expensive and more reliable, the bottom-up approach has become more dominant. However, there are two inherent problems. First, class-independent face detection and face alignment may fail for some classes of persons to be recognized. Second, if face detection fails to detect the face, or if face alignment cannot correctly locate the feature points, face recognition will usually fail.
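By way of illustration only, the bottom-up flow described above, and the way a failure at a lower level prevents recognition, may be sketched as follows. This is a minimal sketch under stated assumptions and not an implementation of any particular AFR system: the names `bottom_up`, `StageResult`, `detect`, `align`, and `recognize` are hypothetical, and the three stages are supplied by the caller as plain functions that return `None` on failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical data shapes, chosen only for this sketch.
Box = Tuple[int, int, int, int]         # face region as (x, y, width, height)
Points = Sequence[Tuple[float, float]]  # key feature points of the face


@dataclass
class StageResult:
    identity: Optional[str]      # recognized person, or None if the pipeline failed
    failed_stage: Optional[str]  # name of the level that failed, or None on success


def bottom_up(
    image: object,
    detect: Callable[[object], Optional[Box]],
    align: Callable[[object, Box], Optional[Points]],
    recognize: Callable[[object, Points], Optional[str]],
) -> StageResult:
    """Run detection, then alignment, then recognition, each feeding the next."""
    box = detect(image)
    if box is None:
        # No face located: the higher levels never receive any input.
        return StageResult(None, "detection")

    points = align(image, box)
    if points is None:
        # Feature points not located: recognition cannot proceed.
        return StageResult(None, "alignment")

    identity = recognize(image, points)
    if identity is None:
        return StageResult(None, "recognition")
    return StageResult(identity, None)
```

In this sketch, data flows only upward: neither `detect` nor `align` receives any information about the persons to be recognized, which corresponds to the class-independent processing described above.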
Furthermore, with the bottom-up approach, conventional face alignment concentrates on general-purpose face alignment (GPFA). GPFA builds its model from the faces of many persons other than the persons to be recognized in order to cover the variance of all faces. Accordingly, it attains generalization at the cost of specialization. Moreover, GPFA does not consider its higher-level tasks, that is, the tasks beyond its immediate part or level. Different tasks may have different requirements. For example, face recognition needs readily distinguishable features, whereas face animation requires accurate positions of key points.
In the top-down approach, shown in FIG. 2, the higher level guides the lower level. With class-specific knowledge, the top-down approach could perform better for the objects that provide this knowledge. However, there are difficulties with the top-down approach. First, there may be large variations within the classes. If the variations cannot be properly modeled, they will introduce unexpected errors. Second, in order to model the large variations, various models may be used, and it is then difficult to determine which of these models to choose for a particular test example. Third, building a model with class-specific knowledge may require more effort.
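By way of illustration only, the model-selection difficulty noted above may be made concrete with a naive strategy: fit every class-specific model to the test example and keep the one with the smallest fitting error. This strategy is an assumption introduced for this sketch and is not taken from the approaches described above. The name `select_class_model` is hypothetical, and each model is represented as a caller-supplied function that returns a fitting error, where a lower value is assumed to mean a better fit.

```python
from typing import Callable, Dict, Tuple


def select_class_model(
    image: object,
    class_models: Dict[str, Callable[[object], float]],
) -> Tuple[str, float]:
    """Fit every class-specific model and return the best (label, error) pair.

    Assumption for this sketch: each model is a function mapping an image to
    a fitting error, and a lower error indicates a better fit.
    """
    if not class_models:
        raise ValueError("at least one class-specific model is required")

    # Exhaustive search: the cost grows linearly with the number of models.
    errors = {label: fit(image) for label, fit in class_models.items()}
    best_label = min(errors, key=errors.get)
    return best_label, errors[best_label]
```

The exhaustive search in this sketch evaluates every model for every test example, which reflects the additional effort associated with class-specific modeling.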