A constant pursuit in the field of computer animation is to enhance the realism of computer generated images. A related goal is to develop techniques for creating 3D models of real, moving objects that accurately represent the color and shading of the object and the changes in the object's appearance as it moves over time.
One of the most elusive goals in computer animation has been the realistic animation of the human face. Possessed of many degrees of freedom and capable of deforming in many ways, the face has been difficult to simulate accurately enough to pass the animation Turing test--fooling the average person into thinking a piece of computer animation is actually an image of a real person.
Examples of previous work in facial animation are discussed in Lee, Y., Terzopoulos, D., and Waters, K., "Realistic modeling for facial animation," Computer Graphics 29, 2 (July 1995), 55-62; Waters, K., "A muscle model for animating three-dimensional facial expression," in Computer Graphics (SIGGRAPH '87 Proceedings) (July 1987), M. C. Stone, Ed., vol. 21, pp. 17-24; and Cassell, J., Pelachaud, C., Badler, N., Steedman, M., Achorn, B., Becket, T., Douville, B., Prevost, S., and Stone, M., "Animated conversation: Rule-based generation of facial expression, gesture and spoken intonation for multiple conversational agents," Computer Graphics 28, 2 (Aug. 1994), 413-420. These approaches use a synthetic model of facial action or structure rather than deriving motion from real data. The systems of Lee et al. and Waters are designed to make it relatively easy to animate facial expression manually. The system of Cassell et al. is designed to create a dialog automatically rather than faithfully reconstruct a particular person's facial expression.
Other examples of facial animation include work by Williams and by Bregler et al. See Williams, L., "Performance-driven facial animation," Computer Graphics 24, 2 (Aug. 1990), 235-242, and the image-morphing paper by Beier, T., and Neely, S., "Feature-based image metamorphosis," in Computer Graphics (SIGGRAPH '92 Proceedings) (July 1992), E. E. Catmull, Ed., vol. 26, pp. 35-42. Williams uses a single static texture image of a real person's face and tracks points only in 2D. Bregler et al. use speech recognition to locate "visemes" in a video of a person talking and then synthesize new video, based on the original video sequence, for the mouth and jaw region of the face to correspond with synthetic utterances. Visemes, the visual analog of phonemes, consist of the shape and placement of the lips, tongue, and teeth that correspond to a particular phoneme. Bregler et al. do not create a three-dimensional face model, nor do they vary the expression on the remainder of the face.
An important part of creating realistic facial animation involves the process of capturing an accurate 3D model. However, capturing an accurate 3D model solves only part of the problem--the color, shading, and shadowing effects still need to be captured as well. Proesmans et al. have proposed a one-shot 3D acquisition system for animated objects that can be applied to a human face. See Proesmans, M., Van Gool, L., and Oosterlinck, A., "One-Shot Active 3d Shape Acquisition," Proceedings of the 13th IAPR International Conference on Pattern Recognition, Aug. 25-26, 1996, vol. III C, pp. 336-340. Their approach uses a slide projector that projects a regular pattern on a moving object. The pattern is detected in each image of a video sequence taken of the moving object with the pattern applied to it. The shape of the moving object is then derived from the detected pattern by assuming a pseudo-orthographic projection of the pattern on the object.
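The core geometric idea behind such structured-light systems can be illustrated with a minimal triangulation sketch. Under an orthographic camera with the projector offset by a known angle, a projected stripe observed shifted from its reference position corresponds to a depth offset proportional to the shift divided by the tangent of that angle. The function below is a simplified, hypothetical illustration of this principle; its names and parameters are not drawn from Proesmans et al., and their actual method involves detecting a full 2D grid pattern rather than a single stripe.

```python
import math

def depth_from_pattern_shift(observed_x, reference_x, projector_angle_rad, pixel_size):
    """Illustrative structured-light triangulation (hypothetical, simplified).

    Assuming an orthographic camera and a projector tilted by
    projector_angle_rad relative to the viewing direction, a pattern
    stripe detected at observed_x instead of its flat-reference position
    reference_x implies a depth offset of shift / tan(angle), where the
    shift is converted from pixels to world units via pixel_size.
    """
    shift = (observed_x - reference_x) * pixel_size  # lateral stripe displacement
    return shift / math.tan(projector_angle_rad)     # depth offset from triangulation
```

For example, with a projector at 45 degrees (where tan is 1), a 10-pixel stripe shift at a pixel size of 1 mm maps directly to a 10 mm depth offset; a stripe seen at its reference position implies zero offset.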
Although these prior approaches have made strides, more effective methods are needed to create realistic and efficient models of the complex structure, color, and shading of facial expressions and other complex real-world objects.