Animated characters are traditionally generated from living beings using either audio input from a subject, such as speech, or visual input obtained by tracking the subject's facial movement. Typically, speech from a human is captured and recorded, and voice samples, known as phonemes, are extracted from the speech input. Phonemes are sounds within a spoken language, such as the “b” and “oo” in “book” in English. From these basic sounds, an animated character can be manipulated to mouth the speech and thereby emulate a human speaker.
In other prior art techniques, video is captured and recorded, and visual samples, known as visemes, are extracted from the captured video. Visemes are visual samples that correspond to facial features, such as mouth, teeth and tongue positions, when pronouncing phonemes. The visemes can then be stored in a database so that each phoneme can be matched to a corresponding viseme.
By matching visemes with corresponding phonemes and morphing consecutive visemes together, an animated character can be generated to emulate a human face during speech. An example of a prior art technique for generating facial animation from a human is illustrated in FIG. 1. One problem with the technique illustrated in FIG. 1 is that the actual expression of the person whose face is being modeled is not portrayed in the animated face. As a result, the expression of the character does not vary for the same phonemes.
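The matching and morphing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the technique of FIG. 1: the viseme database `VISEME_DB`, the parameter names `jaw_open` and `lip_round`, and the linear blend are all hypothetical choices made for the example.

```python
from typing import Dict, List

# A viseme is modeled here as a set of named mouth-shape parameters in [0, 1].
Viseme = Dict[str, float]

# Hypothetical viseme database: phoneme -> mouth-shape parameters.
VISEME_DB: Dict[str, Viseme] = {
    "b": {"jaw_open": 0.0, "lip_round": 0.1},
    "oo": {"jaw_open": 0.3, "lip_round": 0.9},
    "k": {"jaw_open": 0.4, "lip_round": 0.2},
}


def morph(a: Viseme, b: Viseme, t: float) -> Viseme:
    """Linearly blend viseme a into viseme b; t=0 gives a, t=1 gives b."""
    return {key: (1.0 - t) * a[key] + t * b[key] for key in a}


def animate(phonemes: List[str], steps: int = 4) -> List[Viseme]:
    """Look up one viseme per phoneme and morph between consecutive visemes."""
    shapes = [VISEME_DB[p] for p in phonemes]
    if not shapes:
        return []
    frames: List[Viseme] = []
    for current, following in zip(shapes, shapes[1:]):
        # Emit `steps` in-between frames, starting at the current viseme.
        for i in range(steps):
            frames.append(morph(current, following, i / steps))
    frames.append(dict(shapes[-1]))  # finish exactly on the last viseme
    return frames


frames = animate(["b", "oo", "k"], steps=4)
```

Because the lookup depends only on the phoneme, the same phoneme always yields the same mouth shape in this sketch, which mirrors the lack of expression variance noted above.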
Another prior art technique involves tracking regions of an object during object movement and generating a corresponding animated object by mapping points within the tracked regions from the real object to the animated object. Features of a human face, such as the mouth, make certain shapes while the person is talking. In one prior art technique, points in the mouth region are tracked and mapped onto the final animated face. One problem with this technique is that sporadic errors in tracking and/or mapping input points to the animated character can cause noticeable distortion in the facial expression of the animated character.
Tracking and recognizing facial motions using parametric models of image motion is another technique for generating animated characters. These techniques typically model motions within facial regions rather than track individual feature points. One prior art technique uses affine models to model character facial motion. An affine model is a set of linear equations for modeling two-dimensional image motion. These equations contain a number of parameters corresponding to motion, such as translation, rotation and scaling, from which quantities such as divergence and curl can be derived. Furthermore, the models can be extended with quadratic terms for expressing more complex motion types.
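A minimal sketch of such a model is given below, assuming a common six-parameter form, u = a0 + a1·x + a2·y and v = a3 + a4·x + a5·y, where (u, v) is the image motion at point (x, y). The text does not specify a parameterization, so the class `AffineMotion` and the helper names are hypothetical. In this form, divergence and curl can be read directly from the linear coefficients.

```python
import math
from typing import NamedTuple, Tuple


class AffineMotion(NamedTuple):
    """Six-parameter affine flow: u = a0 + a1*x + a2*y, v = a3 + a4*x + a5*y."""
    a0: float
    a1: float
    a2: float
    a3: float
    a4: float
    a5: float


def from_similarity(tx: float, ty: float, angle: float, scale: float) -> AffineMotion:
    """Affine flow produced by translation, rotation (radians) and uniform scaling.

    A point p moves to scale * R(angle) * p + t, so its displacement is
    (scale * R - I) * p + t.
    """
    c = scale * math.cos(angle)
    s = scale * math.sin(angle)
    return AffineMotion(tx, c - 1.0, -s, ty, s, c - 1.0)


def flow_at(m: AffineMotion, x: float, y: float) -> Tuple[float, float]:
    """Motion vector (u, v) predicted by the model at image point (x, y)."""
    return (m.a0 + m.a1 * x + m.a2 * y, m.a3 + m.a4 * x + m.a5 * y)


def divergence(m: AffineMotion) -> float:
    """Isotropic expansion of the flow field: du/dx + dv/dy."""
    return m.a1 + m.a5


def curl(m: AffineMotion) -> float:
    """Rotation of the flow field about the viewing axis: dv/dx - du/dy."""
    return m.a4 - m.a2


shift = from_similarity(tx=2.0, ty=-1.0, angle=0.0, scale=1.0)
zoom = from_similarity(tx=0.0, ty=0.0, angle=0.0, scale=1.1)
```

Here `shift` moves every point by the same vector, while `zoom` has positive divergence and no curl. Modeling a whole facial region with six numbers is what distinguishes this approach from tracking individual feature points.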
Another prior art technique is illustrated in FIG. 2. The technique illustrated in FIG. 2 tracks and recognizes facial motion using optical flow techniques to generate individual motion vectors corresponding to image points. These motion vectors may then be used to determine the final expression of the animated character.
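To make the notion of a per-point motion vector concrete, the sketch below estimates one vector by block matching between two grayscale frames. Block matching is used here only as a simple stand-in: the text does not say which optical flow method FIG. 2 employs, and the function names, patch radius and search range are assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale frame indexed as image[y][x]


def block_cost(prev: Image, curr: Image, x: int, y: int,
               dx: int, dy: int, r: int) -> float:
    """Sum of squared differences between the patch around (x, y) in prev
    and the patch around (x + dx, y + dy) in curr."""
    total = 0.0
    for j in range(-r, r + 1):
        for i in range(-r, r + 1):
            diff = curr[y + dy + j][x + dx + i] - prev[y + j][x + i]
            total += diff * diff
    return total


def motion_vector(prev: Image, curr: Image, x: int, y: int,
                  r: int = 1, search: int = 2) -> Tuple[int, int]:
    """Return the displacement (dx, dy) that best matches the patch at (x, y)."""
    h, w = len(prev), len(prev[0])
    if not (r <= x < w - r and r <= y < h - r):
        raise ValueError("patch around (x, y) must lie inside the image")
    best = (0, 0)
    best_cost = block_cost(prev, curr, x, y, 0, 0, r)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose patch would leave the image.
            if not (r <= x + dx < w - r and r <= y + dy < h - r):
                continue
            cost = block_cost(prev, curr, x, y, dx, dy, r)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

Calling `motion_vector` at each image point of interest yields a field of individual motion vectors of the kind described above. How those vectors are turned into a final expression is not covered by this sketch.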
Several prior art methods exist for generating an animated character from visual or speech input. These techniques, however, typically fall short in modeling real-time motion accurately and reliably.