1. Field of the Invention
The present invention relates to an image recognizing apparatus, an image recognizing method, and a program for the image recognizing method. In particular, the present invention relates to a technique which is suitably used to detect a specific subject such as a person, an automobile or the like or a part thereof from an image.
2. Description of the Related Art
A technique of detecting a specific subject image from a general image is widely applied to various fields such as image search, object detection, object recognition, object tracking and the like. As an example of such a technique, a method of detecting a face area from a general image has been proposed (see P. Viola and M. Jones, “Robust Real-time Object Detection”, SECOND INTERNATIONAL WORKSHOP ON STATISTICAL AND COMPUTATIONAL THEORIES OF VISION, Jul. 13, 2001). In this method, a small rectangular area (hereinafter called a detection window) is first extracted from an input image, and it is discriminated whether or not a face is included in the detection window. This discrimination is performed by passing the detection window through a discriminator constituted by cascade-connecting strong discriminators. In a case where the detection window is discriminated as a subject by all the strong discriminators, a result indicating that the face is included in the detection window is output. In all other cases, a result indicating that the face is not included in the detection window is output.
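The cascade described above can be illustrated with a minimal sketch. It is not the implementation of the cited method: the names `cascade_accepts`, `sliding_windows` and `detect`, the two toy stages, and the fixed window size are all assumptions made for illustration. It shows only the control flow, in which a detection window is reported as containing the subject only when every strong discriminator accepts it.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Window = List[List[float]]
# A strong discriminator is modeled here as any function window -> bool (assumption).
StrongDiscriminator = Callable[[Window], bool]


def cascade_accepts(window: Window, stages: Sequence[StrongDiscriminator]) -> bool:
    """Return True only if every stage accepts the window; stop at the first rejection."""
    for stage in stages:
        if not stage(window):
            return False
    return True


def sliding_windows(image: Window, size: int, step: int) -> Iterator[Tuple[int, int, Window]]:
    """Yield (top, left, window) for each square detection window of side `size`."""
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            yield top, left, [row[left:left + size] for row in image[top:top + size]]


def detect(image: Window, stages: Sequence[StrongDiscriminator],
           size: int, step: int) -> List[Tuple[int, int]]:
    """Return the (top, left) corner of every window accepted by all stages."""
    return [(top, left)
            for top, left, window in sliding_windows(image, size, step)
            if cascade_accepts(window, stages)]


# Hypothetical toy stages; real strong discriminators are learned from training data.
def stage_bright(window: Window) -> bool:
    total = sum(sum(row) for row in window)
    return total / (len(window) * len(window[0])) > 0.5


def stage_uniform(window: Window) -> bool:
    values = [v for row in window for v in row]
    return max(values) - min(values) < 0.5


image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
hits = detect(image, [stage_bright, stage_uniform], size=2, step=1)
```

Because a window is discarded as soon as one stage rejects it, most background windows cost only the evaluation of the first stage or two.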
On the other hand, as an effective method for detecting a human whole body area, whose shape varies more than that of a face, a method has been proposed which uses as a feature quantity an HOG (Histograms of Oriented Gradients), in which a histogram of the gradients in a rectangular area is formed for each gradient direction (see N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection” (CVPR2005)). Incidentally, the human whole body area will be called a human body area in the following description.
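As a rough illustration of the feature quantity described above, the following sketch accumulates the gradient magnitudes of one rectangular area into direction bins. The function name `orientation_histogram`, the choice of nine unsigned bins over 0 to 180 degrees, and the central-difference gradient are assumptions. The block normalization used in the cited method is omitted, so this is only one building block of an HOG feature, not the complete descriptor.

```python
import math
from typing import List


def orientation_histogram(cell: List[List[float]], bins: int = 9) -> List[float]:
    """Histogram of gradient directions in one rectangular area, weighted by magnitude.

    Directions are unsigned (0 to 180 degrees); border pixels are skipped because
    the central difference needs a neighbor on each side.
    """
    histogram = [0.0] * bins
    bin_width = 180.0 / bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = int(angle // bin_width) % bins  # the modulo guards the 180-degree edge
            histogram[index] += magnitude
    return histogram


# A vertical edge: intensity changes only along x, so all votes fall in the first bin.
vertical_edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
vertical_hist = orientation_histogram(vertical_edge)

# A horizontal edge: intensity changes only along y, so all votes fall near 90 degrees.
horizontal_edge = [[0.0] * 4, [0.0] * 4, [1.0] * 4, [1.0] * 4]
horizontal_hist = orientation_histogram(horizontal_edge)
```

The two toy inputs differ only in edge direction, and their histograms place all of the weight in different bins, which is the property that makes the feature sensitive to shape.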
The techniques described above have a problem in that a physical body, a background or the like which is similar in shape to a subject, but is not actually the subject, is erroneously detected. In particular, in the case of detecting a human body, an area around the human body tends to be erroneously detected. For example, there is a case where a shoulder or a leg, which is a part of the human body, is erroneously detected as the human body area. This is conceivably because the shape of the shoulder or the leg is similar to the shape of the human body (i.e., the shape of a laterally-facing human body). Moreover, when a person overlaps a background having an upwardly rounded shape, such as a tree or a mountain, an area including not only the person but also the background is erroneously detected as the human body area. This is conceivably because the shape obtained by combining the tree or the mountain with the person is similar to the shape of a human body.
An actual human body area often exists in the vicinity of such an erroneous detection, and that human body area is itself correctly detected. For this reason, a result obtained by correctly detecting the human body area and a result obtained by erroneously detecting an area other than the human body area often overlap each other. To address such a situation, Japanese Patent Application Laid-Open No. 2010-176504 has proposed a method of, when there are overlapping detection results, comparing the likelihoods of these results and selecting the detection result having the higher likelihood.
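The selection among overlapping detection results can be sketched as follows. This is not the procedure of the cited publication, whose details are not given here. The `Detection` type, the intersection-over-union overlap measure, the threshold `min_overlap` and the greedy processing order are all assumptions chosen to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Detection:
    left: float
    top: float
    right: float
    bottom: float
    likelihood: float


def overlap_ratio(a: Detection, b: Detection) -> float:
    """Intersection area divided by union area (assumed overlap measure)."""
    inter_w = min(a.right, b.right) - max(a.left, b.left)
    inter_h = min(a.bottom, b.bottom) - max(a.top, b.top)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    return inter / (area_a + area_b - inter)


def select_by_likelihood(detections: Iterable[Detection],
                         min_overlap: float = 0.3) -> List[Detection]:
    """Among overlapping results keep only the one with the higher likelihood.

    Candidates are visited from highest to lowest likelihood; a candidate is
    dropped when it overlaps an already kept result by at least `min_overlap`.
    """
    kept: List[Detection] = []
    for candidate in sorted(detections, key=lambda d: d.likelihood, reverse=True):
        if all(overlap_ratio(candidate, k) < min_overlap for k in kept):
            kept.append(candidate)
    return kept


body = Detection(0, 0, 10, 20, likelihood=0.9)      # correct whole-body result
shoulder = Detection(0, 0, 10, 12, likelihood=0.4)  # erroneous result overlapping it
distant = Detection(50, 0, 60, 20, likelihood=0.5)  # separate, non-overlapping result
selected = select_by_likelihood([shoulder, distant, body])
```

Note that the rule looks only at overlap and likelihood: any lower-likelihood result that overlaps a kept result is discarded, whether or not it was actually erroneous.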
However, the above method is not suitable for a case where both of the overlapping detection results correspond to actual human body areas. Examples include a case where a child stands in front of an adult, and a case where two persons appear to stand side by side although one of them actually stands at a distance behind the other. When the above method is applied in such circumstances, there is a risk that a correctly detected human body area is deleted from the detection candidates as an erroneous detection result.