Techniques for detecting contour points of the eyes and mouth in a facial image have been actively researched in the prior art because they can be applied to pre-stage processing for face authentication and expression estimation, and to applications such as portrait generation.
For example, Japanese Unexamined Patent Publication No. 9-6964 (published Jan. 10, 1997) (Patent Document 1) describes a technique of setting a search range for the eye, mouth, and the like centered on a center point of the eye, mouth, and the like specified by a user, scanning the set search range, and extracting an eye region, mouth region, and the like based on a color component and the like. Patent Document 1 also describes identifying the left and right end points of the extracted eye region, mouth region, and the like, setting, based on the left and right end points, a search range for the upper and lower end points of the eye region, mouth region, and the like, and extracting the upper and lower end points.
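The two-stage search described for Patent Document 1 can be illustrated with a simplified sketch. It is not the implementation of Patent Document 1: the function names, the rectangular window shape, the color predicate `is_region_color`, and the reuse of `half_height` for the second search range are all assumptions made for illustration.

```python
import numpy as np

def _clip_window(image, row_range, col_range):
    """Return (row offset, column offset, sub-image) for an inclusive range, clipped to the image."""
    h, w = image.shape[:2]
    r0, r1 = max(row_range[0], 0), min(row_range[1] + 1, h)
    c0, c1 = max(col_range[0], 0), min(col_range[1] + 1, w)
    return r0, c0, image[r0:r1, c0:c1]

def extract_end_points(image, center, half_width, half_height, is_region_color):
    """Hypothetical helper: find left/right, then upper/lower end points of a region.

    image:           (H, W, 3) array
    center:          (row, col) center point specified by the user
    is_region_color: assumed per-pixel predicate mapping an (h, w, 3) window to a boolean (h, w) mask
    """
    cy, cx = center
    # Stage 1: scan the search range centered on the user-specified point.
    r0, c0, window = _clip_window(
        image, (cy - half_height, cy + half_height), (cx - half_width, cx + half_width))
    ys, xs = np.nonzero(is_region_color(window))
    if xs.size == 0:
        return None  # the region lies outside the search range
    i_l, i_r = int(xs.argmin()), int(xs.argmax())
    left = (r0 + int(ys[i_l]), c0 + int(xs[i_l]))
    right = (r0 + int(ys[i_r]), c0 + int(xs[i_r]))

    # Stage 2: set a second search range from the left and right end points
    # (assumption: the columns between them, centered vertically on their mean row).
    y_mid = (left[0] + right[0]) // 2
    r0, c0, band = _clip_window(
        image, (y_mid - half_height, y_mid + half_height), (left[1], right[1]))
    ys, xs = np.nonzero(is_region_color(band))
    i_u, i_d = int(ys.argmin()), int(ys.argmax())
    upper = (r0 + int(ys[i_u]), c0 + int(xs[i_u]))
    lower = (r0 + int(ys[i_d]), c0 + int(xs[i_d]))
    return {"left": left, "right": right, "upper": upper, "lower": lower}
```

In this sketch the function returns `None` whenever no pixel inside the search range matches the predicate, and the number of pixels scanned grows with the area of the search range.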
Japanese Unexamined Patent Publication No. 2005-339288 (published Dec. 8, 2005) (Patent Document 2) describes, for extracting the contour points of an eye, fitting a dynamic contour model using the left and right end points of the eye as reference points, and extracting the contour points of the eye by energy minimization.
Methods of detecting the contour points of the eye and the mouth from a facial image include fitting methods based on a shape model and a texture model. Specifically, fitting methods such as the ASM (Active Shape Model), the AAM (Active Appearance Model), and the ASAM (Active Structure Appearance Model) are known, as described in T. F. Cootes, et al., “Active Shape Models - Their Training and Application”, CVIU, Vol. 61, No. 1, pp. 38-59, 1995 (non-Patent Document 1), T. F. Cootes, et al., “Active appearance models”, ECCV '98 Vol. II, Freiburg, Germany, 1998 (non-Patent Document 2), Japanese Patent Publication No. 4093273 (issued Jun. 4, 2008) (Patent Document 3), and Japanese Patent Publication No. 4501937 (issued Jul. 14, 2010) (Patent Document 4).
The shape models of the ASM, the AAM, and the ASAM are models that express the shape and texture of the face with a small number of parameters. These models are obtained by applying principal component analysis to the face feature point coordinate information and the texture information, and expressing the feature point coordinates of the face using only those basis vectors, among the basis vectors obtained as a result, that have large eigenvalues. Accordingly, not only can the shape of the face be expressed with a small amount of data, but restraining conditions for maintaining the shape of the face can also be provided. Such a model is fitted to a facial image by energy minimization in the ASM and the AAM, and by model parameter error calculation in the ASAM, to detect the feature point coordinates of the face.
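The construction of such a shape model by principal component analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any cited document: the function names are hypothetical, each shape is assumed to be a pre-aligned, flattened vector of feature point coordinates, and the limit of three standard deviations per parameter is one common way of expressing the restraining condition.

```python
import numpy as np

def fit_shape_model(shapes, n_modes):
    """Fit a linear shape model to (n_samples, 2 * n_points) flattened landmark coordinates."""
    mean_shape = shapes.mean(axis=0)
    centered = shapes - mean_shape
    # The rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = singular_values ** 2 / (shapes.shape[0] - 1)
    # Keep only the basis vectors with the largest eigenvalues.
    return mean_shape, vt[:n_modes], eigenvalues[:n_modes]

def encode(shape, mean_shape, basis):
    """Project a shape onto the retained basis vectors to obtain its few parameters."""
    return basis @ (shape - mean_shape)

def decode(params, mean_shape, basis, eigenvalues=None, limit=3.0):
    """Rebuild a shape from parameters, optionally clamping them to plausible values."""
    if eigenvalues is not None:
        # Restraining condition (assumed form): |b_i| <= limit * sqrt(eigenvalue_i).
        bound = limit * np.sqrt(eigenvalues)
        params = np.clip(params, -bound, bound)
    return mean_shape + basis.T @ params
```

With `n_modes` much smaller than the number of coordinates, `encode` compresses a shape to a few parameters, and clamping in `decode` keeps a reconstructed shape close to the shapes seen in training. The same clamping also limits how well shapes far from the training data can be represented.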
The expression of the face changes in various ways and has a wide range of variations according to the shape of the mouth, the shape of the eye, combinations thereof, and the like. Thus, it is difficult to predict all shape states of an object, such as the eye or the mouth, that changes into various shapes. In the prior art described above, therefore, it is difficult to detect, with high accuracy, the contour points of an object whose shape greatly changes, such as the contour points of the eye and the mouth.
Specifically, in the technique described in Patent Document 1, the contour point cannot be correctly detected if the shape of the eye, mouth, or the like changes beyond what is assumed and the contour point of the eye, mouth, or the like is not within the search range. If the search range is set wide to cover the various shapes of the mouth and the eye, the processing load becomes very large, since the technique described in Patent Document 1 detects the contour point by scanning the search range. It is thus impractical to set the search range wide in the technique described in Patent Document 1. Therefore, it is difficult for the technique described in Patent Document 1 to detect, with high accuracy, the contour point of an object whose shape greatly changes.
Furthermore, in the technique described in Patent Document 2, it takes a very long time to extract the contour point of the object, or the contour point cannot be correctly extracted, if the shape of the object is significantly different from the dynamic contour model being used. If various models are prepared to cover the various shapes of the mouth and the eye, the accuracy in extracting the contour point is enhanced, but the size of the data to be stored in the device and the processing load may become large. Thus, it is impractical to prepare various models in the technique described in Patent Document 2. Therefore, it is difficult for the technique described in Patent Document 2 to detect, with high accuracy, the contour point of an object whose shape greatly changes.
The ASM and the AAM have a drawback in that a great amount of calculation time is required for the search processing. The AAM also has a problem in that a shape model needs to be prepared for each individual, and the fitting accuracy with respect to the face of another person is low.
The ASAM realizes higher speed and higher accuracy than the ASM and the AAM. The ASAM uses the shape of the face as the restraining condition, so that highly accurate detection results can be obtained for a face in which the expression change is small. The ASAM, however, cannot detect with high accuracy a face with an expression in which the open/close state and the shape state of the mouth, eye, and the like greatly change. This is because the shape model of the face used in the ASAM is a global model that expresses the shape of the entire face, and cannot accurately represent the changes in each region such as the eye and the mouth, for example, the open/close state and the shape state.
In light of the foregoing, it is an object of at least one embodiment of the present invention to realize an image processing device for detecting the shape of an object on an image with high accuracy, even with respect to the object that changes to various shapes, an information generation device, an image processing method, an information generation method, a control program, and a recording medium.