In recent years, some imaging apparatuses such as digital cameras have been configured to detect the region of a person or a face in the image being captured and to display that region surrounded by a frame (hereinafter called an object detection frame) (refer to, for example, Patent Literature (hereinafter abbreviated as “PTL”) 1).
Displaying an object detection frame enables a user to judge instantaneously where in the image of a subject a target such as a person or a face (hereinafter sometimes called a detection target object) is located, and allows the user to smoothly perform an operation such as placing the target in the center of the image being captured. In an imaging apparatus that performs automatic focus (AF) or automatic exposure (AE) control on the surrounded target, the user can also verify, based on the object detection frame, the region for which the focus or exposure is adjusted.
Displaying an object detection frame naturally requires a technique for detecting the object. PTL 2 describes a technique for detecting a face in an image being captured. In PTL 2, an indicator value (score) of the similarity between sample face images determined by pre-learning and the image being captured is calculated, and an image region in which the indicator value is equal to or greater than a threshold is detected as a candidate region for a face image. In practice, a plurality of candidate regions, that is, a candidate region group, are detected in the area surrounding one and the same face image. PTL 2 therefore performs a further threshold judgment on these candidate regions to integrate the candidate regions that belong to the same face image.
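The threshold-based candidate detection described above can be illustrated with a minimal sketch. This is not the method of PTL 2 itself: the square window, the stride, the function name `detect_candidates`, and the toy score function `mean_brightness` are all hypothetical and stand in for a learned face-similarity score. The sketch only shows why a single target tends to yield several overlapping candidate regions.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x, y, width, height)
Candidate = Tuple[Box, float]        # (region, score)
Image = Sequence[Sequence[float]]    # rows of pixel values


def detect_candidates(
    image: Image,
    window: int,
    stride: int,
    score_fn: Callable[[List[Sequence[float]]], float],
    threshold: float,
) -> List[Candidate]:
    """Return every window whose score is at least `threshold`.

    `score_fn` is a hypothetical stand-in for the similarity score
    computed against pre-learned sample face images.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    candidates: List[Candidate] = []
    # Visit window positions row by row, left to right.
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            patch = [row[x:x + window] for row in image[y:y + window]]
            score = score_fn(patch)
            if score >= threshold:
                candidates.append(((x, y, window, window), score))
    return candidates


def mean_brightness(patch: List[Sequence[float]]) -> float:
    """Toy score (assumption): average pixel value of the patch."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


# A 6x6 image containing one bright 3x3 "target" at rows 2-4, columns 2-4.
image = [[1.0 if 2 <= r <= 4 and 2 <= c <= 4 else 0.0 for c in range(6)]
         for r in range(6)]
candidates = detect_candidates(image, window=3, stride=1,
                               score_fn=mean_brightness, threshold=0.6)
```

With these toy inputs, the window that exactly covers the target and its immediate neighbors all pass the threshold, so one target produces a group of candidate regions that must subsequently be integrated.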
When the object detection frame described in PTL 1 is combined with the object detection described in PTL 2, the following object detection frame display processing is performed.
First, an object detector raster-scans the input image to form object detection frame candidates around a target object. Next, object detection frame candidates in proximity to one another are integrated to form the ultimate integrated frame, which is then displayed. More specifically, the detection frame candidates are grouped using their scores and the like, and the grouped detection frame candidates in proximity to one another are integrated and displayed. As a result, an object detection frame surrounding the target object (the ultimate integrated frame) is displayed.
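The grouping and integration step can be sketched as follows. The text does not specify the proximity criterion or the integration rule, so this sketch makes assumptions: proximity is judged by intersection-over-union (IoU) against the highest-scoring member of each group, the `min_iou` value of 0.3 is arbitrary, and each group is integrated by a score-weighted average of its frame coordinates. All function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x, y, width, height)
Candidate = Tuple[Box, float]             # (frame candidate, score)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def group_candidates(candidates: List[Candidate],
                     min_iou: float = 0.3) -> List[List[Candidate]]:
    """Greedily group candidates, visiting them in descending score order.

    Assumption: a candidate joins the first group whose seed (its
    highest-scoring member) overlaps it by at least `min_iou`.
    """
    groups: List[List[Candidate]] = []
    for cand in sorted(candidates, key=lambda c: c[1], reverse=True):
        for group in groups:
            if iou(cand[0], group[0][0]) >= min_iou:
                group.append(cand)
                break
        else:
            groups.append([cand])  # no nearby group: start a new one
    return groups


def integrate(group: List[Candidate]) -> Box:
    """Merge one group into a single frame by score-weighted averaging."""
    total = sum(score for _, score in group)
    if total <= 0:
        # Fall back to equal weights when the scores carry no information.
        weights = [1.0 / len(group)] * len(group)
    else:
        weights = [score / total for _, score in group]
    return tuple(
        sum(box[i] * w for (box, _), w in zip(group, weights))
        for i in range(4)
    )


def integrated_frames(candidates: List[Candidate],
                      min_iou: float = 0.3) -> List[Box]:
    """Return one integrated frame per group of nearby candidates."""
    return [integrate(g) for g in group_candidates(candidates, min_iou)]


candidates = [
    ((10.0, 10.0, 20.0, 20.0), 0.9),
    ((12.0, 10.0, 20.0, 20.0), 0.6),
    ((10.0, 12.0, 20.0, 20.0), 0.5),
    ((100.0, 100.0, 20.0, 20.0), 0.8),
]
frames = integrated_frames(candidates)
```

In this example the three overlapping candidates near (10, 10) fall into one group and the distant candidate forms its own group, so two integrated frames result. Other grouping criteria, such as center distance, would fit the description equally well.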