1. Field of the Invention
The present invention relates to an object detection technique, and more particularly, to a technique for detecting a predetermined object from input information.
2. Description of the Related Art
In an object detection apparatus that detects an object included in input information, it is desirable that the object can be detected even when the posture of the object changes or when the object is partially shielded. To deal with such various states of the object, including changes in posture and shielding, it is effective to detect the object using a plurality of different detectors.
Techniques for detecting an object using a plurality of different detectors have conventionally been proposed. In a document entitled “Improved Part-based Human detection Using Depth Information” by Takayoshi Yamashita, Sho Ikemura, Hironobu Fujiyoshi, and Yuji Iwahori, The transactions of the Institute of Electrical Engineers of Japan. D, Vol. 131, No. 4 (2011) (hereinafter referred to as Document 1), a face detector and an upper body detector are combined to perform human detection that copes with a change in the direction of a person and with partial shielding of the person. An advantage of combining the face detector and the upper body detector will be specifically described below. Because various face detection methods have been developed, the face detector can detect a face with high performance. If the face is visible, the person can be detected with a high probability. When the human detection is performed using only the face detector, however, the face becomes difficult to see depending on the direction of the person, so that the person becomes difficult to detect. In addition, if the size of the person in the image decreases, the information about the face texture decreases, so that the person becomes difficult for the face detector to detect. On the other hand, the upper body detector can detect the upper body of a person in a standing posture regardless of the direction of the person. If a part of the upper body is shielded, however, the detection performance of the upper body detector deteriorates. In Document 1, the face detector and the upper body detector are combined so that each compensates for the disadvantages of the other in detecting the person.
If the object is detected using the plurality of different detectors, the different detection results need to be merged so that one detection result is output for one person. The issue at this time is how the different detection results are merged. In particular, the issue is the merging method used when a plurality of persons exist adjacent to one another and overlap one another. If the results of the upper body detector and the face detector are merged, for example, detection results that greatly overlap each other are simply merged and are output as a result for the same person. When a plurality of persons overlap one another, a face detection result 1202 of the person behind may therefore be merged with an upper body detection result 1201 of the person in front, as illustrated in FIG. 1. As a result, a final result 1203 contains only the person in front, even though the person behind has been detected by the face detector.
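The overlap-based merging described above, and the way it can lose the person behind, can be illustrated with a minimal sketch. The box format (x1, y1, x2, y2), the overlap measure (the fraction of the face box lying inside the upper body box), the 0.5 threshold, and the names overlap_ratio and naive_merge are all assumptions introduced for illustration; they are not taken from Document 1.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def overlap_ratio(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer` (assumed measure)."""
    ix1, iy1 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix2, iy2 = min(inner[2], outer[2]), min(inner[3], outer[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return (iw * ih) / area if area > 0 else 0.0


def naive_merge(upper_bodies: List[Box], faces: List[Box],
                threshold: float = 0.5) -> List[Box]:
    """Treat any face that greatly overlaps an upper body as the same person.

    Every upper body is kept as one person. A face is kept as a separate
    person only if it does not greatly overlap any upper body.
    """
    persons = list(upper_bodies)
    for face in faces:
        absorbed = any(overlap_ratio(face, ub) >= threshold
                       for ub in upper_bodies)
        if not absorbed:
            persons.append(face)
    return persons


# Upper body of the person in front, and the face of a person standing behind.
front_upper_body = (100.0, 100.0, 300.0, 400.0)
rear_face = (220.0, 90.0, 280.0, 150.0)

# The rear face lies mostly inside the front upper body box, so it is absorbed
# and only one person is reported although two were detected.
result = naive_merge([front_upper_body], [rear_face])
print(len(result), "person(s) reported")
```

In this sketch the face of the person behind is absorbed into the upper body result of the person in front, which corresponds to the situation of the final result 1203.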
In Document 1, to address this issue, a face position is estimated from the detection result of the upper body detector and is combined with the positions of the detection results of the face detector, and the cluster centers of the combined positions are found by “mean shift”. By this processing, the results of a plurality of detectors that detect different sites are merged.
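The merging by “mean shift” can be sketched roughly as follows. This is only an illustration under stated assumptions and not the procedure of Document 1 itself: the rule that places the estimated face center at 20% of the upper body height below the top edge, the Gaussian kernel, the bandwidth value, and the names estimate_face_center, mean_shift_modes, and merge_face_positions are all hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format
Point = Tuple[float, float]


def estimate_face_center(upper_body: Box) -> Point:
    """Assumed rule: face center is horizontally centered, 20% down the box."""
    x1, y1, x2, y2 = upper_body
    return ((x1 + x2) / 2.0, y1 + 0.2 * (y2 - y1))


def box_center(box: Box) -> Point:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def mean_shift_modes(points: List[Point], bandwidth: float,
                     max_iter: int = 50, tol: float = 1e-3) -> List[Point]:
    """Shift each point to a density mode using a Gaussian kernel."""
    modes: List[Point] = []
    for x, y in points:
        for _ in range(max_iter):
            wsum = sx = sy = 0.0
            for qx, qy in points:
                d2 = (qx - x) ** 2 + (qy - y) ** 2
                w = math.exp(-d2 / (2.0 * bandwidth ** 2))
                wsum += w
                sx += w * qx
                sy += w * qy
            nx, ny = sx / wsum, sy / wsum
            shift = math.hypot(nx - x, ny - y)
            x, y = nx, ny
            if shift < tol:
                break
        # Keep the mode only if it is not close to one already found.
        if all(math.hypot(mx - x, my - y) >= bandwidth for mx, my in modes):
            modes.append((x, y))
    return modes


def merge_face_positions(upper_bodies: List[Box], faces: List[Box],
                         bandwidth: float = 40.0) -> List[Point]:
    """Pool estimated and detected face centers, then cluster them."""
    points = [estimate_face_center(ub) for ub in upper_bodies]
    points += [box_center(f) for f in faces]
    return mean_shift_modes(points, bandwidth)


modes = merge_face_positions(
    upper_bodies=[(100.0, 100.0, 300.0, 400.0)],
    faces=[(175.0, 120.0, 235.0, 180.0), (500.0, 100.0, 560.0, 160.0)],
)
print(modes)
```

Note that the sketch weights the estimated face center and the detected face center equally, so a cluster center lies between them irrespective of how reliable each position is.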
In Document 1, however, because the face position is estimated from the detection result of the upper body detector, the estimated face position tends to be lower in reliability than the face position represented by the face detection result. Since the estimated face position, which is low in reliability, and the face position represented by the face detection result, which is relatively high in reliability, are simply merged with each other, the face position that is finally output may be erroneous. Furthermore, in Document 1, although the upper body detector is used, the range of the upper body cannot be specified even when the entire upper body is visible.