1. Field of the Invention
The present invention relates to a data processing method and apparatus for recognizing an object represented in a two-dimensional image.
2. Description of the Related Art
Conventionally, studies have been made on recognizing an object included in a two-dimensional image, and in particular on identifying a person from his or her face. A face included in an arbitrarily obtained two-dimensional image is not always a full face; it may be directed upward, downward, left or right. As the pose of the face differs from image to image, the appearance of the face included in the two-dimensional image also differs. Further, the appearance of the face differs when the amount of light reflected from each portion of the face differs, for example because of a difference in the state of the light source used for picking up the two-dimensional image. In this manner, the appearance of a person's face changes considerably as the pose of the face changes or as the state of the light source used for image pick-up changes. Therefore, a person included in a two-dimensional image cannot be correctly recognized by a simple method of pattern matching.
As a means of solving this problem, a model face method has been proposed, in which a face image is represented as a model and the model parameters are estimated from an input image. Chang Seok CHOI, Toru OKAZAKI, Hiroshi HARASHIMA and Tsuyoshi TAKEBE, “Basis Generation and Description of Facial Images Using Principal-Component Analysis,” Journal of Information Processing Graphics and CAD, August 1990, vol. 46, No. 7, pp. 43-50 proposes a first example, in which the three-dimensional shape of a face and the color information at each position of the face (texture) are each subjected to principal-component analysis, and a linear sum of the resulting base shapes and base textures is used to describe the face image. According to the first method of description, the influence of head rotation on the face can be eliminated, and thus a realistic face image can be created.
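By way of illustration only, the linear-sum description of the first example can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the cited reference: the four-element "face vector", the mean, and the two orthonormal base vectors are hypothetical values standing in for bases that would in practice be obtained by principal-component analysis of many faces, and the function names `synthesize` and `describe` are introduced here for illustration.

```python
# Toy linear-sum ("model face") description. All data below are made up.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def synthesize(mean, bases, coeffs):
    """Return mean + sum_i coeffs[i] * bases[i]."""
    out = list(mean)
    for c, b in zip(coeffs, bases):
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def describe(sample, mean, bases):
    """Project a sample onto orthonormal bases to obtain its coefficients."""
    centered = [s - m for s, m in zip(sample, mean)]
    return [dot(centered, b) for b in bases]

# Hypothetical mean "face vector" (e.g. stacked shape or texture values).
mean = [1.0, 2.0, 3.0, 4.0]
# Two hand-picked orthonormal vectors standing in for principal components.
bases = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]

coeffs = [2.0, -1.0]                      # model parameters
face = synthesize(mean, bases, coeffs)    # face generated from the parameters
recovered = describe(face, mean, bases)   # parameters recovered from the face
print(face, recovered)
```

In this sketch a face is fully described by a short coefficient vector, and a face that lies in the span of the bases is mapped back to the same coefficients by projection.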
Volker Blanz et al., “A Morphable Model for the Synthesis of 3D Faces”, SIGGRAPH 99 describes, as a second example of the method of describing a face image, a method of recovering a three-dimensional shape from a single two-dimensional image, using a linear sum model of range data and RGB image data in a cylindrical coordinate system.
T. F. Cootes et al., “Active Appearance Models”, Burkhardt and Neumann, editors, Computer Vision-ECCV '98, Vol. II, Freiburg, Germany, 1999 describes, as a third example of the method of describing the face image, a method in which the two-dimensional shape of a face and the texture of the face are each subjected to principal-component analysis, the resulting bases and model parameters are used for model representation of the face, and the model parameters are estimated from the two-dimensional input image. Because the relation between the residual (the difference between the input image and the estimated model) and the modification vector of the model parameters is learned in advance, high-speed optimization is realized.
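The idea of learning the residual-to-modification relation in advance can be illustrated with a one-parameter toy model. The sketch below is an assumption-laden illustration, not the procedure of the cited reference: the model `MEAN + p * BASIS`, its values, and the function names are hypothetical, and the relation is obtained here by averaging the residual response to known parameter perturbations and inverting it in the least-squares sense.

```python
# Toy one-parameter appearance model; every value here is illustrative.
MEAN = [0.2, 0.4, 0.6, 0.8]
BASIS = [1.0, -1.0, 0.5, 0.0]

def render(p):
    """Model texture for parameter p: MEAN + p * BASIS."""
    return [m + p * b for m, b in zip(MEAN, BASIS)]

def residual(image, p):
    """Pixel-wise difference between the input and the current model."""
    return [i - g for i, g in zip(image, render(p))]

def learn_update(perturbations, p0=0.0):
    """Offline step: average the residual response per unit of known
    parameter displacement, then invert it in the least-squares sense."""
    jac = [0.0] * len(MEAN)
    for dp in perturbations:
        # Residual seen when the true parameter is p0 + dp but the model is at p0.
        r = residual(render(p0 + dp), p0)
        jac = [j + (ri / dp) / len(perturbations) for j, ri in zip(jac, r)]
    norm = sum(j * j for j in jac)
    return [j / norm for j in jac]  # weights w such that dp is approx. w . r

def fit(image, weights, p=0.0, steps=3):
    """Online step: each update is a single weighted sum of the residual."""
    for _ in range(steps):
        r = residual(image, p)
        p += sum(w * ri for w, ri in zip(weights, r))
    return p

weights = learn_update([-0.5, -0.1, 0.1, 0.5])
estimate = fit(render(0.7), weights)  # input generated with true p = 0.7
print(estimate)
```

Because this toy model is exactly linear, the first update already lands on the true parameter; a real appearance model is only approximately linear, so several such inexpensive updates would be expected.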
In the first example, however, no method is described for generating the parameters of this description from an input face image. Simply inputting a two-dimensional image is therefore not sufficient to recognize the face image contained in it.
In the second example, the input two-dimensional image and the linear sum model are compared by finding the residual on the input image plane, and therefore differences in the shape of the face are not well reflected. Accordingly, even when the shape of the linear sum model and the shape of the person in the two-dimensional image differ, the residual between the two-dimensional image and the image obtained by projecting the linear sum model onto the two-dimensional image sometimes becomes small, which raises the possibility of erroneous recognition. As the input two-dimensional image provides information on only one side of a face, which is a solid body, estimating the initial values of the parameters is difficult. Further, as the steepest gradient method is used for optimization, it takes time until convergence is attained.
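The slow convergence attributed to the steepest gradient method can be illustrated on a toy problem. The following is only a sketch under assumptions: the poorly scaled quadratic f(x, y) = 0.5 * (x**2 + 100 * y**2) is a hypothetical stand-in for an image residual, and the fixed step size and tolerance are arbitrary choices made here, not values from the cited reference.

```python
def grad(x, y):
    """Gradient of the toy objective f(x, y) = 0.5 * (x**2 + 100 * y**2)."""
    return x, 100.0 * y

def steepest_descent(x, y, step=0.018, tol=1e-6, max_iter=100000):
    """Fixed-step steepest descent; returns the final point and iteration count."""
    for k in range(max_iter):
        gx, gy = grad(x, y)
        if (gx * gx + gy * gy) ** 0.5 < tol:
            return x, y, k
        # The step must stay small for the steep y direction, which makes
        # progress along the shallow x direction slow.
        x -= step * gx
        y -= step * gy
    return x, y, max_iter

x_final, y_final, iterations = steepest_descent(1.0, 1.0)
print(iterations)
```

Even for this two-variable problem, several hundred iterations are needed, because the step size is limited by the steepest direction while the shallowest direction determines how long convergence takes.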
In the third example, the represented model is two-dimensional, and therefore the three-dimensional shape of the face contained in the input two-dimensional image is not considered. Consequently, it has been difficult to determine whether a difference between the face contained in the two-dimensional image and the model face comes from a difference in pose.