Tracking the 3-D motion of a face in a sequence of 2-D images of the face is an important problem with applications to facial animation, hands-free human-computer interaction, and lip-reading. Tracking the motion of the face involves tracking the 2-D positions of salient features on the face. The salient features could be in the form of (i) points, such as the corners of the mouth, the eye pupils, or external markers placed on the face; (ii) lines, such as the hair-line, the boundary of the lips, and the boundaries of the eyebrows; and (iii) regions, such as the eyes, the nose, and the mouth. Techniques are known for using markers to track selected facial features, including the eyebrows, ears, mouth, and corners of the eyes.
The salient features can also be synthetically created by placing markers on the face. Tracking of salient features is generally accomplished by detecting and matching a plurality of salient features of the face across a sequence of 2-D images of the face. Detecting and matching the salient features is made difficult by variations in illumination, occlusion of the features, poor video quality, and the real-time constraint on computer processing of the 2-D images.
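As a rough illustration of what matching a salient feature between two consecutive images can look like, the sketch below locates a small image patch from one frame in the next frame by minimizing the sum of squared differences (SSD) over a search window. This is one common generic approach, not a method described in this document; the function names (`ssd`, `crop`, `match_feature`, `make_frame`), the patch size, the search radius, and the synthetic frames are all assumptions made for the example.

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized patches."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def crop(image, top, left, size):
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def match_feature(prev_frame, next_frame, top, left, size=3, radius=2):
    """Find where the patch at (top, left) in prev_frame moved to in next_frame.

    Tries every offset within +/- radius pixels and returns the (top, left)
    position in next_frame with the lowest SSD score.
    """
    template = crop(prev_frame, top, left, size)
    height, width = len(next_frame), len(next_frame[0])
    best_pos, best_score = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            # Skip candidate windows that fall outside the image.
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue
            score = ssd(template, crop(next_frame, t, l, size))
            if best_score is None or score < best_score:
                best_pos, best_score = (t, l), score
    return best_pos


def make_frame(top, left, height=8, width=8):
    """Synthetic frame (assumed test data): dark background, one textured 3x3 blob."""
    frame = [[0] * width for _ in range(height)]
    for i in range(3):
        for j in range(3):
            frame[top + i][left + j] = 100 + 10 * i + j
    return frame


frame_a = make_frame(2, 2)
frame_b = make_frame(3, 4)  # the blob moved down 1 pixel and right 2 pixels
print(match_feature(frame_a, frame_b, 2, 2))  # prints (3, 4)
```

A plain SSD score of this kind is sensitive to exactly the difficulties listed above: a change in illumination alters every pixel difference, and an occluded feature may have no good match at all. The exhaustive search also grows with the square of the search radius, which bears on the real-time constraint.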