As a general technique for constructing and estimating the 3D shape of an object from its 2D image, the technique disclosed in Japanese Unexamined Patent Publication No. 1999-242745 is known, for example. This technique uses either stereo/multi-lens images taken by two or more cameras, or a pattern-irradiated image, that is, an image of the object taken while a known pattern is projected onto it with visible light or infrared light. In a case where the shape of the object is constrained, the 3D shape of the object can be estimated from a single image by exploiting that constraint. For example, in a case where vertical lines intersect horizontal lines at right angles, as in a building, or in a case where a specific pattern such as a repeated pattern is drawn on a plane, the 3D shape of the object can be calculated by using the vanishing point principle and geometric information such as the cross ratio. However, in the case of a “face”, that is, an object that has neither a formulaic geometric constraint on its shape, such as being a plane or a sphere, nor a specific pattern in color or brightness, the above-mentioned general stereo/multi-lens images and the pattern-irradiated image are used.
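The geometric invariant mentioned above, commonly called the cross ratio, can be illustrated with a short sketch. It is not taken from the cited publication. It only shows, under assumed values, that the cross ratio of four collinear points is unchanged by a projective mapping, which is what makes it usable for recovering geometry from a single image of a known pattern. The point positions and the matrix `h` are hypothetical.

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross ratio (A, B; C, D) of four collinear points, each given by its
    scalar position along the line."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def project_1d(x, h):
    """Apply a 1D projective map, given as a 2x2 matrix h, to position x."""
    return (h[0, 0] * x + h[0, 1]) / (h[1, 0] * x + h[1, 1])

# Hypothetical equally spaced marks on a planar repeated pattern.
points = np.array([0.0, 1.0, 2.0, 3.0])

# Hypothetical projective map standing in for the camera projection
# restricted to the line carrying the marks (assumed, not measured).
h = np.array([[2.0, 1.0],
              [0.5, 3.0]])

image_points = project_1d(points, h)

before = cross_ratio(*points)        # computed on the object
after = cross_ratio(*image_points)   # computed in the image

print(before, after)
```

The image positions are no longer equally spaced, yet the two cross ratios agree up to floating-point error, so a known spacing on the object constrains the geometry observed in the image.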
However, such methods require measurement apparatuses that are not used in normal photographing, such as a plurality of cameras (a stereo/multi-lens camera), a pattern irradiator for projecting pattern light, and a camera for detecting light such as infrared light. This causes problems such as an increase in cost and a limitation on the measurement environment. Furthermore, it is necessary to store information from the time of the measurement, such as the position of the camera and the irradiating position of the pattern irradiator at the time of photographing. As a result, the measurement environment is limited, and only an image taken in advance in an environment dedicated to the measurement can be used. A “face”, however, is often photographed as a single image without any regard to such measurement. Consequently, according to the conventional technique, it is impossible to estimate the 3D shape of a face from a single face image.
In this connection, according to a technique disclosed in Japanese Unexamined Patent Publication No. 2001-84362, an object with a 3D shape that includes a part having a substantially constant surface reflectance is photographed under a substantially single light source. The direction of the light source and the 3D shape of the object are estimated based on the image obtained by the photographing. A face classification system disclosed in Japanese Unexamined Patent Publication No. 1993-266173 has a first means for locating a face in a 2D display of a 3D frame, a second means for detecting the face in the display, a third means for generating a feature vector of the face, and a fourth means for comparing the feature vector of the face detected this time with the feature vector of a face detected previously, thereby determining whether or not the face detected this time corresponds to the face detected previously. International Publication No. WO-02/007095-A1 discloses a face 3D orientation tracking device for sequentially estimating the orientation of a person's face from input images taken in a time-series fashion.
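The feature vector comparison described for the face classification system can be sketched as follows. The text above does not state which distance measure or decision rule that system uses, so the Euclidean distance, the threshold value, the function name `is_same_face`, and the example vectors are all assumptions made for illustration.

```python
import numpy as np

def is_same_face(current, previous, threshold=0.5):
    """Return True when two feature vectors are closer than the threshold.

    The Euclidean distance and the threshold are assumptions; the cited
    publication may use a different measure.
    """
    current = np.asarray(current, dtype=float)
    previous = np.asarray(previous, dtype=float)
    distance = float(np.linalg.norm(current - previous))
    return distance < threshold

# Hypothetical feature vectors (for example, normalized facial measurements).
previous_face = [0.42, 0.31, 0.58]
current_face = [0.40, 0.33, 0.57]   # close to the previous face
other_face = [0.90, 0.10, 0.20]     # clearly different

print(is_same_face(current_face, previous_face))
print(is_same_face(other_face, previous_face))
```

In practice the threshold would have to be tuned on labeled pairs of faces, since a value that is too small rejects the same person under a different expression and a value that is too large merges different persons.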
Although the face does not have a formulaic geometric shape such as a plane or a sphere, the schematic arrangement and the topological shape of feature points such as the eyes and the mouth are the same from face to face. The various individual facial shapes can be obtained based on the degree of deviation of the feature point arrangement from that of a standard face. Moreover, since photographed face images are overwhelmingly full-face (frontal) images, the topological shape is also constrained in the face image. From that point of view, a 3D face shape is estimated from a single face image by using a learning model obtained from the 3D face shapes of a plurality of persons (refer to T. Vetter et al., “A Morphable Model For The Synthesis of 3D Faces”, ACM Conf. SIGGRAPH 99, pp. 187-194, 1999). However, according to this method, it is necessary to manually designate feature points in the face image in order to match the face image to the learning model.
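The idea of describing an individual face as a deviation from a standard face, and of estimating a 3D shape from feature points in a single frontal image, can be sketched as a linear model fitted by least squares. This is a simplified illustration and not the method of the cited paper, which is considerably more elaborate. The random "learned" model, the orthographic frontal projection, and all names (`mean_shape`, `modes`, `fit_coeffs`) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_modes = 6, 3

# Stand-ins for a model learned from the 3D face shapes of many persons:
# a standard (mean) face and a few modes of deviation from it.
mean_shape = rng.normal(size=(n_points, 3))        # (points, xyz)
modes = rng.normal(size=(n_modes, n_points, 3))    # (modes, points, xyz)

def reconstruct(coeffs):
    """3D shape = standard face + weighted sum of deviation modes."""
    return mean_shape + np.tensordot(coeffs, modes, axes=1)

def fit_coeffs(landmarks_2d):
    """Least-squares fit of mode weights to 2D feature points.

    Assumes a frontal face under orthographic projection, so the image
    coordinates are simply the x and y components of the 3D points.
    """
    a = modes[:, :, :2].reshape(n_modes, -1).T          # (2 * points, modes)
    b = (landmarks_2d - mean_shape[:, :2]).reshape(-1)  # (2 * points,)
    coeffs, *_ = np.linalg.lstsq(a, b, rcond=None)
    return coeffs

# Simulate designated feature points of one person, then recover the shape.
true_coeffs = np.array([0.8, -0.3, 0.5])
observed = reconstruct(true_coeffs)[:, :2]

estimated = fit_coeffs(observed)
shape_3d = reconstruct(estimated)   # includes the depth never observed

print(estimated)
```

The depth coordinates are recovered only because the model ties them to the observed x and y positions, which is why the feature points must first be matched to the model, the step that the method above performs manually.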