Monocular face tracking is the process of estimating facial motion, position, and shape from a monocular image sequence captured by a stationary camera. It is a core process in many image processing systems, such as video conferencing systems. For instance, in a video conferencing system, estimating facial motion or position reduces the amount of facial data that must be exchanged or processed. That is, parameters describing the estimated facial motion, position, and shape can be exchanged or processed to output an image sequence, instead of exchanging or processing large amounts of image data.
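To give a rough sense of the data reduction described above, the following sketch compares the size of one uncompressed frame with the size of a small parameter set. All figures are assumptions chosen for illustration only (a 640x480 RGB frame, six pose parameters, twenty shape coefficients, 32-bit floats); none of them come from the text, and the function names are hypothetical.

```python
import struct

# Hypothetical figures chosen only for illustration; none come from the text.
FRAME_WIDTH, FRAME_HEIGHT, BYTES_PER_PIXEL = 640, 480, 3


def raw_frame_bytes(width=FRAME_WIDTH, height=FRAME_HEIGHT, bpp=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame."""
    return width * height * bpp


def pack_face_parameters(pose, shape):
    """Serialize pose and shape parameters as little-endian 32-bit floats."""
    values = list(pose) + list(shape)
    return struct.pack("<%df" % len(values), *values)


# Assumed layout: 3 rotations and 3 translations, then 20 shape coefficients.
pose = [0.0, 0.1, -0.05, 12.0, -3.0, 80.0]
shape = [0.0] * 20

payload = pack_face_parameters(pose, shape)
print("parameter payload:", len(payload), "bytes")
print("raw frame:", raw_frame_bytes(), "bytes")
```

Under these assumptions, the per-frame parameter payload is several orders of magnitude smaller than the raw frame. A real system would typically compare against compressed video rather than raw frames, so the actual saving depends on the codec and the model used.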
One type of face tracking system is based on markers (a “marker face tracking system”). In a marker face tracking system, the user is required to wear color “markers” at known locations. The movement of the markers is then parameterized to estimate facial position and shape. A disadvantage of the marker face tracking system is that it is intrusive for the user. In particular, the user must place a number of color markers at various positions on the face. Furthermore, the user must spend time putting on the markers, which makes such a system more cumbersome to use.
Another type of face tracking system is a model-based face tracking system, which uses a parameterized face shape model to estimate facial position and motion. In prior model-based face tracking systems, parameterized models are built manually, e.g., by using a 3D scanner or a computer-aided design (CAD) modeler. Hence, a disadvantage of prior model-based face tracking systems is that the manual building of face shape models is ad hoc, which leads to a trial-and-error approach to obtaining tracking models. Such an ad hoc process yields inaccurate and suboptimal models.
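As an illustration of what a parameterized face shape model can look like, the following minimal sketch assumes one common form: a mean 3D shape plus a weighted sum of deformation modes, followed by a rigid motion and a perspective projection. This form, the three-point "face", the single deformation mode, and all function names are assumptions made for the example; the text does not specify how prior models are parameterized.

```python
import math


def deform(mean_shape, basis, coeffs):
    """Return mean_shape + sum_k coeffs[k] * basis[k], point by point."""
    shape = []
    for i, point in enumerate(mean_shape):
        shape.append(tuple(
            point[axis] + sum(c * mode[i][axis] for c, mode in zip(coeffs, basis))
            for axis in range(3)
        ))
    return shape


def rotate_y_and_translate(shape, yaw, translation):
    """Apply a rotation about the vertical axis, then a translation."""
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [
        (cos_y * x + sin_y * z + tx, y + ty, -sin_y * x + cos_y * z + tz)
        for x, y, z in shape
    ]


def project(shape, focal_length):
    """Perspective projection of 3D points onto the image plane."""
    return [(focal_length * x / z, focal_length * y / z) for x, y, z in shape]


# Hypothetical three-point face: left eye, right eye, mouth.
mean_shape = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
# One hypothetical deformation mode that moves only the mouth downward.
basis = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, -1.0, 0.0)]]

shape_3d = deform(mean_shape, basis, coeffs=[0.5])
posed = rotate_y_and_translate(shape_3d, yaw=0.0, translation=(0.0, 0.0, 10.0))
image_points = project(posed, focal_length=100.0)
print(image_points)
```

In a model of this kind, tracking amounts to finding the pose values and deformation coefficients that best explain the observed image points in each frame. The quality of the result depends directly on how well the mean shape and deformation modes describe real faces, which is where a manually built model can fall short.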