1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, an image processing program, an image capturing apparatus and a controlling method thereof that detect a face region from an image.
2. Description of the Related Art
Recent digital still cameras that mainly capture still images and digital video cameras that mainly capture moving images are equipped with functions for detecting the face of a person to be captured, using the detected result for auto focus (AF), auto exposure (AE), and auto white balance (AWB) controls, and adjusting the skin color of the captured image.
To use such functions effectively, it is necessary not only to achieve a high face detection rate but also to accurately detect the position and size of the face. If the position and size of the face cannot be detected accurately, a non-face image such as a part of the background may be included in the detected portion. As a result, the non-face image adversely affects the calculation results of parameters for the AF, AE, AWB, and color adjustment controls.
For example, Japanese Patent Application Laid-Open No. 2005-157679 (referred to as Patent Document 1) describes a method of determining whether or not the object to be captured is a face based on a likelihood value that represents the likelihood of a face.
On the other hand, Japanese Patent Application Laid-Open No. 2004-30629 (referred to as Patent Document 2) describes a method of scanning an image while gradually reducing its size and detecting the position and size of the face from the image.
In Patent Documents 1 and 2, the captured image is scanned, as shown in FIG. 4, by moving a square region having a predetermined size by Δx in the horizontal direction and by Δy in the vertical direction from the upper left end to the lower right end of the image, such that the position of a face is obtained. In addition, as shown in FIG. 3, by scanning images that are gradually reduced by a predetermined reduction rate Δr, the size of the face is obtained.
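The scan described above can be sketched as follows. This is a minimal illustration and not the implementation of Patent Document 1 or 2: the function names are hypothetical, and Δr is assumed to be a multiplicative factor between 0 and 1 applied at each reduction step, which the text does not state.

```python
from typing import Iterator, Tuple


def scan_positions(width: int, height: int, win: int,
                   dx: int, dy: int) -> Iterator[Tuple[int, int]]:
    """Yield upper-left corners of a win x win square in raster order,
    from the upper left end toward the lower right end of the image."""
    for y in range(0, height - win + 1, dy):
        for x in range(0, width - win + 1, dx):
            yield x, y


def pyramid_sizes(width: int, height: int, win: int,
                  dr: float) -> Iterator[Tuple[float, int, int]]:
    """Yield (scale, reduced_width, reduced_height) until the square
    region no longer fits in the reduced image."""
    if not 0.0 < dr < 1.0:
        raise ValueError("dr is assumed to be a factor between 0 and 1")
    scale = 1.0
    while True:
        w, h = int(width * scale), int(height * scale)
        if w < win or h < win:
            break
        yield scale, w, h
        scale *= dr


def scan_all(width: int, height: int, win: int, dx: int, dy: int,
             dr: float) -> Iterator[Tuple[float, float, float]]:
    """Yield candidate squares (x, y, size) mapped back to the
    coordinates of the unreduced image."""
    for scale, w, h in pyramid_sizes(width, height, win, dr):
        for x, y in scan_positions(w, h, win, dx, dy):
            yield x / scale, y / scale, win / scale
```

Because the square region keeps a fixed size while the image shrinks, a square found in a more strongly reduced image corresponds to a larger face in the original image.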
Thereafter, for each square region having the predetermined size, the differences between the luminance values of pixels at pre-learnt pairs of points among all pixels of the region are calculated, and a likelihood value that represents the likelihood of a face is obtained based on the calculated results. A threshold value for determining whether or not the object is a face is preset for the likelihood values of square region images. If a calculated likelihood value is equal to or larger than the threshold value, it is determined that the object is a face. In contrast, if the likelihood value is smaller than the threshold value, it is determined that the object is not a face.
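The likelihood calculation and threshold test can be illustrated as below. The text does not say how the luminance differences are combined into a likelihood value, so the weighted sum used here, the per-pair weights, and the names `likelihood` and `is_face` are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]             # (x, y) inside the square region
Pair = Tuple[Point, Point, float]   # two points plus a hypothetical learnt weight


def likelihood(region: Sequence[Sequence[int]], pairs: List[Pair]) -> float:
    """Combine luminance differences at pre-learnt point pairs.

    region[y][x] is the luminance of the pixel at (x, y).
    The weighted sum is an assumed combination rule.
    """
    total = 0.0
    for (x1, y1), (x2, y2), weight in pairs:
        diff = region[y1][x1] - region[y2][x2]
        total += weight * diff
    return total


def is_face(value: float, threshold: float) -> bool:
    """A region is judged to be a face if its likelihood value is
    equal to or larger than the preset threshold value."""
    return value >= threshold
```

Note that the comparison is inclusive: a likelihood value exactly equal to the threshold value is treated as a face, as stated above.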
As shown in FIG. 5, if an image is scanned in the horizontal direction, the closer the square region image is to the X coordinate of the face, the larger the likelihood value of the real face is, and the farther the square region image is from the X coordinate, the smaller the likelihood value of the real face is. In this example, since the likelihood values of square regions F2, F3, and F4 are equal to or larger than the preset threshold value, they are determined to be faces. On the other hand, since the likelihood values of square regions F1 and F5 are smaller than the threshold value, they are determined not to be faces.
In addition, as shown in FIG. 6, if an image is scanned in the vertical direction, the closer the square region image is to the Y coordinate of the face, the larger the likelihood value of the real face is, and the farther the square region image is from the Y coordinate, the smaller the likelihood value of the real face is. In this example, since the likelihood values of square regions F12, F13, and F14 are equal to or larger than the threshold value, they are determined to be faces. In contrast, since the likelihood values of square regions F11 and F15 are smaller than the threshold value, they are determined not to be faces.
In addition, as shown in FIG. 7, if an image is reduced, the closer the reduced size of the image is to the real face size, the larger the likelihood value of the real face is, and the more the reduced size of the image differs from the real face size, the smaller the likelihood value of the real face is. In this example, since the likelihood values of square regions F22, F23, F24, and F25 are equal to or larger than the threshold value, they are determined to be faces. On the other hand, since the likelihood value of square region F21 is smaller than the threshold value, it is determined not to be a face.
Thus, in the method of detecting an object from an image and determining whether or not the object is a face, scanning the image causes a plurality of faces having different likelihood values to be detected for one actual face in the image. Consequently, it is necessary to extract a correct face from the detected faces. To extract a correct face, a method of selecting the face having the largest likelihood value from the detected faces may be used.
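The selection of the face having the largest likelihood value can be sketched as follows. The `Detection` record and the `select_best` function are hypothetical names, and the sketch assumes that all detections passed in belong to the same actual face.

```python
from typing import NamedTuple, Optional, Sequence


class Detection(NamedTuple):
    x: float           # X coordinate of the square region
    y: float           # Y coordinate of the square region
    size: float        # side length of the square region
    likelihood: float  # likelihood value of the square region


def select_best(detections: Sequence[Detection],
                threshold: float) -> Optional[Detection]:
    """Keep detections whose likelihood value is equal to or larger
    than the threshold value and return the one with the largest
    likelihood value, or None if no detection qualifies."""
    candidates = [d for d in detections if d.likelihood >= threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d.likelihood)
```

For a horizontal scan like that of FIG. 5, this would return the single square region whose likelihood value is largest among those determined to be faces.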