Conventionally, to obtain the quantitative motion of a capturing object from an image (captured image) of a human face or the like captured by a monocular camera, a simplified model representing the capturing object was hypothetically generated, and the motion of the model was obtained from the motion in the captured image.
For example, to obtain the quantitative motion of a human face from a captured image of the face, models such as those illustrated in FIG. 7 were used: a model representing the human face as a plane (see Non-patent document 1), a model representing it as a cylinder (see Non-patent document 2), and a model representing it as an ellipsoid (see Non-patent document 3).
Also, when the quantitative motion of the capturing object was obtained from an image captured by a monocular camera as in the above-mentioned method, the camera model was simplified by, for example, orthographic transformation or weak perspective transformation.
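The difference between these simplified camera models can be illustrated with a minimal sketch. It is not taken from the cited documents; the function names, the focal-length parameter `f`, and the reference depth `mean_depth` are assumptions introduced only for illustration. Full perspective divides by each point's own depth, weak perspective divides by a single depth shared by the whole object, and orthographic projection ignores depth entirely.

```python
def perspective(point, f):
    """Full perspective projection: each point is scaled by its own depth z."""
    x, y, z = point
    return (f * x / z, f * y / z)


def weak_perspective(point, f, mean_depth):
    """Weak perspective: every point shares one scale f / mean_depth (assumed name)."""
    x, y, _ = point
    scale = f / mean_depth
    return (scale * x, scale * y)


def orthographic(point):
    """Orthographic projection: the depth coordinate is simply dropped."""
    x, y, _ = point
    return (x, y)


if __name__ == "__main__":
    # Hypothetical points on an object roughly 10 units from the camera.
    points = [(1.0, 2.0, 9.0), (1.0, 2.0, 10.0), (1.0, 2.0, 11.0)]
    for p in points:
        print(perspective(p, 2.0), weak_perspective(p, 2.0, 10.0), orthographic(p))
```

The weak perspective result matches full perspective only for points lying at the reference depth, so the approximation degrades as the depth variation of the object grows relative to its distance from the camera.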
Further, the quantitative motion of the capturing object may be obtained from an image captured by a stereo camera instead of a monocular camera (see Non-patent document 4). In this case, the position and posture of the capturing object can be measured accurately by directly fitting, in three dimensions, (a) the three-dimensional coordinate values of the capturing object obtained from the image captured by the stereo camera to (b) a three-dimensional model of the capturing object.
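One common way to carry out such a direct three-dimensional fit, shown here only as a hedged sketch and not as the method of Non-patent document 4, is a least-squares rigid alignment using Horn's closed-form quaternion solution. The sketch assumes that point correspondences between the model and the stereo measurements are already known and that the points are not degenerate (for example, not all identical). The function name `fit_rigid_transform`, the starting vector, and the iteration count are hypothetical, and the dominant eigenvector is found by simple power iteration, which may converge slowly on noisy data.

```python
import math


def fit_rigid_transform(model_pts, measured_pts, iterations=500):
    """Return (R, t) minimizing sum |R * model + t - measured|^2 (Horn's method)."""
    n = len(model_pts)
    # Centroids of the model points and of the measured points.
    mc = [sum(p[k] for p in model_pts) / n for k in range(3)]
    dc = [sum(q[k] for q in measured_pts) / n for k in range(3)]
    # Cross-covariance: S[a][b] = sum of centered model_a * centered measured_b.
    S = [[sum((p[a] - mc[a]) * (q[b] - dc[b])
              for p, q in zip(model_pts, measured_pts))
          for b in range(3)] for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    # Symmetric 4x4 matrix whose top eigenvector is the optimal unit quaternion.
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift the spectrum to be non-negative so power iteration finds the
    # largest (not merely largest-magnitude) eigenvalue of N.
    shift = max(sum(abs(v) for v in row) for row in N)
    quat = [1.0, 0.3, 0.2, 0.1]  # arbitrary non-zero starting vector (assumption)
    for _ in range(iterations):
        nxt = [sum(N[i][j] * quat[j] for j in range(4)) + shift * quat[i]
               for i in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        quat = [v / norm for v in nxt]
    w, x, y, z = quat
    # Rotation matrix corresponding to the unit quaternion (w, x, y, z).
    R = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    # Translation that maps the rotated model centroid onto the measured centroid.
    t = [dc[i] - sum(R[i][j] * mc[j] for j in range(3)) for i in range(3)]
    return R, t


if __name__ == "__main__":
    # Hypothetical data: model rotated 90 degrees about z, then shifted by (1, 2, 3).
    model = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
    measured = [(1, 2, 3), (1, 3, 3), (0, 2, 3), (1, 2, 4)]
    R, t = fit_rigid_transform(model, measured)
    print(R, t)
```

The recovered rotation corresponds to the posture and the translation to the position of the capturing object relative to the model's reference frame.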
[Non-Patent Document 1]    M. J. Black and Y. Yacoob: “Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motions”, ICCV, 1995
[Non-Patent Document 2]    M. La Cascia, S. Sclaroff and V. Athitsos: “Fast, Reliable Head Tracking under Varying Illumination: An Approach Based on Registration of Texture-Mapped 3D Models”, IEEE PAMI, vol. 22, no. 4, April 2000.
[Non-Patent Document 3]    S. Basu, I. Essa, A. Pentland: “Motion Regularization for Model-Based Head Tracking”, Proceedings of the International Conference on Pattern Recognition (ICPR '96), Volume III, p. 611, Aug. 25-29, 1996
[Non-Patent Document 4]    Yoshio Matsumoto, Alexander Zelinsky: “An Algorithm for Real-time Stereo Vision Implementation of Head Pose and Gaze Direction Measurement”, Proceedings of IEEE Fourth International Conference on Face and Gesture Recognition (FG '2000), pp. 499-505, 2000
[Non-Patent Document 5]    Tomasi, Kanade: “Shape and motion from image streams: a factorization method,” Technical Report CMU-CS-91-132, CMU, 1991