The present invention relates to human face shape and motion estimation. More particularly, the present invention relates to such estimation based on integrating optical flow and deformable models.
A wide variety of face models have been used in the extraction and recognition of facial expressions in image sequences. Several 2-D face models based on splines or deformable templates have been developed which track the contours of a face in an image sequence. Terzopoulos and Waters ("Analysis and synthesis of facial image sequences using physical and anatomical models," IEEE Pattern Analysis and Machine Intelligence, 15(6):569-579, 1993) and Essa and Pentland ("Facial expression recognition using a dynamic model and motion energy," in Proceedings ICCV '95, pages 360-367, 1995) use a physics-based 3-D mesh with many degrees of freedom, where face motion is measured in terms of muscle activations. Edge forces from snakes are used in the former, while in the latter, the face model is used to 'clean up' an optical flow field that is used in expression recognition.
Another approach is to directly use the optical flow field from face images. Yacoob and Davis ("Computing spatio-temporal representations of human faces," Proceedings CVPR '94, pages 70-75, 1994) use statistical properties of the flow for expression recognition. Black and Yacoob ("Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion," Proceedings ICCV '95, pages 374-381, 1995) parameterize the flow field based on the structure of the face under projection. Addressing the problem of image coding, Li, et al. ("3-D motion estimation in model-based facial image coding," PAMI, 15(6):545-555, Jun. 1993) estimate face motion using a simple 3-D model by a combination of prediction and a model-based least-squares solution to the optical flow constraint equation. A render-feedback loop is used to combat error accumulation in tracking.
However, none of these approaches permits large head rotations, owing either to the use of a 2-D model or to an inability to handle self-occlusion. Also, none of the previous work makes a serious attempt at extracting the 3-D shape of the face from an image sequence. At best, the boundaries of face parts are located to align the model with an image. Finally, none of the previous face tracking work integrates multiple cues in the tracking of the face.
Accordingly, a system is desired which uses a 3-D model and allows the tracking of large rotations by using self-occlusion information from the model. A system is also desired which extracts the shape of the face using a combination of edge forces and anthropometry information. Moreover, a system is desired which can easily augment the optical flow solution with additional information to improve such solution and which permits the use of a small number of image points to sample the optical flow field, as well as the computation of edge forces to prevent error accumulation in the motion. The present invention has been developed to meet these needs in the art.
The present invention satisfies the aforementioned needs by providing a method and apparatus for human face shape and motion estimation based on integrating optical flow and deformable models. The optical flow constraint equation provides a non-holonomic constraint on the motion of the deformable model. Forces computed from edges and optical flow are used simultaneously. When this dynamic system is solved, a model-based least-squares solution for the optical flow is obtained and improved estimation results are achieved. The use of a 3-D model reduces or eliminates problems associated with optical flow computation. This approach instantiates a general methodology for treating visual cues as constraints on deformable models. The model, which is applied to human face shape and motion estimation, uses a small number of parameters to describe a rich variety of face shapes and facial expressions.
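By way of illustration only, the model-based least-squares solution for the optical flow mentioned above can be sketched in Python for a toy case. The sketch assumes a 2-D rigid motion model with parameters (tx, ty, omega), not the 3-D deformable face model of the invention, and the function names `solve_linear`, `model_based_flow`, and `synthetic_samples` are hypothetical. Each sampled image point contributes one row of the optical flow constraint Ix*u + Iy*v + It = 0, with the image velocity (u, v) expressed through the model parameters, and the stacked rows are solved through the normal equations.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: samples do not constrain all parameters")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def model_based_flow(samples):
    """Estimate (tx, ty, omega) from samples of the form (x, y, Ix, Iy, It).

    Assumed toy motion model: u = tx - omega * y, v = ty + omega * x.
    Substituting into Ix*u + Iy*v + It = 0 gives one linear equation per point:
        [Ix, Iy, Iy*x - Ix*y] . (tx, ty, omega) = -It
    """
    ata = [[0.0] * 3 for _ in range(3)]  # accumulates A^T A
    atb = [0.0] * 3                      # accumulates A^T b
    for x, y, ix, iy, it in samples:
        row = [ix, iy, iy * x - ix * y]
        for i in range(3):
            atb[i] += row[i] * (-it)
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve_linear(ata, atb)


def synthetic_samples(q_true):
    """Build noise-free samples that satisfy the constraint exactly for q_true."""
    tx, ty, omega = q_true
    # (x, y, Ix, Iy): made-up positions and image gradients for the illustration
    points = [
        (1.0, 0.0, 1.0, 0.0),
        (0.0, 1.0, 0.0, 1.0),
        (1.0, 1.0, 1.0, -1.0),
        (2.0, -1.0, 0.5, -1.0),
        (-1.0, 2.0, 1.0, 0.3),
    ]
    samples = []
    for x, y, ix, iy in points:
        u = tx - omega * y
        v = ty + omega * x
        samples.append((x, y, ix, iy, -(ix * u + iy * v)))
    return samples


if __name__ == "__main__":
    print(model_based_flow(synthetic_samples((0.5, -0.2, 0.1))))
```

Because the unknowns in this sketch are the few model parameters and not a per-pixel flow field, a handful of sample points suffices in the toy case, provided the gradients at those points constrain every parameter.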