In a facial motion capture pipeline, the motion of an actor's face is used to drive an animated model of the actor's face. In one embodiment of the pipeline, an actor with marks on his face is filmed by multiple cameras in a motion capture session. Using a variety of processes, the marks are converted to animated 3D points in space. This information is then used to drive an animation rig or model created by artists from a high resolution scan of the actor's face. With the animation information of the 3D points in space, the actor's face can be animated in a life-like manner.
It is desirable for the actor to keep his head steady and stable during the motion capture session. To animate the animation model, facial motion capture should ideally capture only the actor's facial motions, not head motions. Head movements are animated at a later stage in the animation process.
Unfortunately, it is difficult for an actor to act while keeping his head and face completely stable. Thus, every motion capture session includes removing the motion of the head from the captured motion of the individual marks on the actor's face. This process can be difficult because the face is constantly moving and deforming. Furthermore, there are no fixed landmarks on the human face that can be used to compute the overall affine transformation of the head.
Prior approaches included either laborious hand-positioned alignment of the 3D points by trained artists or tracking the 3D location of points on the head that were deemed “stable enough.” Neither of these techniques produces consistent results. In the hand-stabilization approach, the results vary greatly depending on the skill of the artist and the time spent on the project. In the track-the-most-stable-markers approach, significant error results because no part of the face is completely stable.
Horn's orientation/alignment algorithm is a closed-form method for finding the relationship between two coordinate systems using pairs of measurements of the coordinates of a number of points in both systems. It is described in Berthold K. P. Horn, “Closed Form Solution of Absolute Orientation using Unit Quaternions,” Journal of the Optical Society of America A, Vol. 4, pp. 629-42, April 1987, which is hereby incorporated in its entirety.
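By way of illustration only, the quaternion-based alignment described by Horn can be sketched in Python with NumPy as follows. This is a minimal sketch, not the implementation of any particular embodiment: the function name `horn_absolute_orientation` and its argument names are hypothetical, the point sets are assumed to be already matched pair by pair, and the scale factor that Horn's method can also recover is assumed to be 1 (a rigid motion).

```python
import numpy as np

def horn_absolute_orientation(src, dst):
    """Estimate a rotation R and translation t such that dst ~= R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points (hypothetical inputs).
    Scale is assumed to be 1; this sketch recovers a rigid motion only.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)

    # Refer both point sets to their centroids.
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)

    # 3x3 matrix of summed products, M[a, b] = sum_i src[i, a] * dst[i, b].
    M = (src - src_c).T @ (dst - dst_c)
    Sxx, Sxy, Sxz = M[0]
    Syx, Syy, Syz = M[1]
    Szx, Szy, Szz = M[2]

    # Symmetric 4x4 matrix whose dominant eigenvector is the unit quaternion.
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy,        Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz,  Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz,  Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,        Syz + Szy,       -Sxx - Syy + Szz],
    ])

    # eigh returns eigenvalues in ascending order, so the last column of the
    # eigenvector matrix belongs to the most positive eigenvalue.
    _, vecs = np.linalg.eigh(N)
    q0, qx, qy, qz = vecs[:, -1]

    # Convert the unit quaternion (q0, qx, qy, qz) to a rotation matrix.
    R = np.array([
        [q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz),             2*(qx*qz + q0*qy)],
        [2*(qy*qx + q0*qz),             q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
        [2*(qz*qx - q0*qy),             2*(qz*qy + q0*qx),             q0*q0 - qx*qx - qy*qy + qz*qz],
    ])

    # The translation maps the rotated source centroid onto the target centroid.
    t = dst_c - R @ src_c
    return R, t
```

In a head stabilization setting, `src` might hold the 3D points of one frame and `dst` the corresponding points of a reference frame, so that applying `R` and `t` to the frame would remove the estimated head motion. That usage is an assumption made for illustration.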
Random Sample Consensus (RANSAC) is a method for fitting a model to experimental data that is capable of interpreting and smoothing data containing a significant percentage of gross errors. It is described in Fischler, M. A. and Bolles, R. C. “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography” Readings in Computer Vision: Issues, Problems, Principles, and Paradigms, M. A. Fischler and O. Firschein, Eds. Morgan Kaufmann Readings Series. Morgan Kaufmann Publishers, San Francisco, Calif., 726-740, 1981, which is hereby incorporated in its entirety.
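By way of illustration only, the general RANSAC idea of repeatedly fitting a model to a small random sample and keeping the model with the largest consensus set can be sketched in Python with NumPy. The example below fits a 2D line rather than a head transformation, purely to keep the sketch short. The function name `ransac_line` and the parameters `n_iters`, `inlier_tol`, and `seed` are hypothetical and are not taken from the cited reference.

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.1, seed=0):
    """Fit y = a*x + b to 2D points that may contain gross outliers.

    points: (N, 2) array of (x, y) samples (hypothetical input).
    Returns (a, b, inlier_mask).
    """
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(pts), dtype=bool)

    for _ in range(n_iters):
        # Draw the minimal sample needed to define a line: two distinct points.
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue  # a vertical pair cannot be written as y = a*x + b

        # Candidate model from the minimal sample.
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1

        # Consensus set: points whose vertical residual is within tolerance.
        residuals = np.abs(pts[:, 1] - (a * pts[:, 0] + b))
        inliers = residuals < inlier_tol

        # Keep the candidate with the largest consensus set seen so far.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers

    if best_inliers.sum() < 2:
        raise ValueError("no usable consensus set was found")

    # Refit by least squares using only the consensus set.
    a, b = np.polyfit(pts[best_inliers, 0], pts[best_inliers, 1], deg=1)
    return a, b, best_inliers
```

The same sample, score, and refit loop could in principle wrap any model fit, including a rigid alignment of 3D points, with the line model replaced by the transformation and the residual replaced by a point-to-point distance. That substitution is an assumption for illustration.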
Thus, there is a need to compute the global head transformation during a motion capture session based on only the points on the face.