There has conventionally been a technology for detecting a face area contained in image data captured by, for example, a digital camera. In general, when detecting a face area contained in image data, a matching process is performed with a prepared face template to detect the face contained in the image data.
Furthermore, a method has been proposed in which, when a plurality of faces of different sizes are present in image data to be a target of a face detection process, the resolution of the image data or the resolution of the prepared face template is changed in stepwise fashion to detect the plurality of faces of different sizes contained in the image data (for example, Patent Literature 1).
FIG. 10 is a conceptual diagram showing a face detection process in a general face detection apparatus. In FIG. 10, the face detection apparatus performs the face detection process on an input image 90 having a QVGA (Quarter Video Graphics Array) resolution. In order to detect a plurality of faces of different sizes contained in the input image 90, images having different resolutions are generated, based on the input image 90, in accordance with preset resolutions. Here, when generating the images having different resolutions, the input image 90 is reduced by 20% in stepwise fashion as shown in the following Mathematical Formula 1 to generate images 91, 92, 93 . . . having the respective resolutions.
$$\left(\frac{1}{1.25}\right)^{n} \qquad (1)$$
where n denotes a resolution ID indicating a reducing phase of the input image 90. The degree of reduction of the input image 90 increases with an increase in the resolution ID. Since the input image 90 has the QVGA resolution (horizontally 320 pixels×vertically 240 pixels), the image 91 where the resolution ID=0 is equivalent to the QVGA resolution image. Next, an image where the resolution ID=1 is the image 92 (horizontally 256 pixels×vertically 192 pixels) that has 1/1.25th of the QVGA resolution, i.e., that is scaled to 80% of the QVGA resolution image. Similarly, an image where the resolution ID=2 is the image 93 (horizontally 205 pixels×vertically 154 pixels) that is scaled to 64% of the QVGA resolution image.
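The stepwise reduction of Mathematical Formula 1 can be illustrated by the following minimal sketch. The function name `pyramid_sizes` and the rounding of pixel counts to the nearest integer are assumptions made for illustration only. The rounding is chosen because it reproduces the sizes stated above for the images 91, 92, and 93.

```python
def pyramid_sizes(width, height, levels, step=1.25):
    """Return (resolution_id, width, height) for each reduction phase.

    The scale for resolution ID n is (1/step)**n, as in Mathematical Formula 1.
    Rounding to the nearest pixel is an assumption of this sketch.
    """
    sizes = []
    for n in range(levels):
        scale = (1.0 / step) ** n
        sizes.append((n, round(width * scale), round(height * scale)))
    return sizes


# QVGA input image, first three resolution IDs.
for n, w, h in pyramid_sizes(320, 240, 3):
    print(f"resolution ID={n}: {w} x {h}")
# prints 320 x 240, 256 x 192, and 205 x 154 for IDs 0, 1, and 2
```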
FIG. 11 is a diagram showing the relation between the resolution and the number of pixels in images to be generated having the respective resolutions. More specifically, FIG. 11 shows the relation between the resolution IDs which are obtained when the QVGA resolution image is reduced by 20% in stepwise fashion based on the aforementioned Mathematical Formula 1 to generate images having different resolutions, and the number of vertical and horizontal pixels in the images to be generated having the respective resolutions.
Here, when the matching process is performed by using, for example, a face template having horizontally 20 pixels×vertically 20 pixels, images are generated having the resolutions ranging from the resolution ID=0 to the resolution ID=11, which are obtained by reducing the QVGA resolution image by 20% in stepwise fashion, as shown in FIG. 11. Because the minimum resolution (the resolution ID=11) is horizontally 29 pixels×vertically 23 pixels, if the image is reduced any further, matching with the face template having horizontally 20 pixels×vertically 20 pixels cannot be performed. Therefore, the resolution ID ranges from 0 to 11. The phases of the resolution IDs are thus determined based on the size of the prepared face template.
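The way the template size bounds the range of resolution IDs can be sketched as follows. The function name `max_resolution_id` is hypothetical, and the sketch compares unrounded dimensions against the template size, which is an assumption. The pixel counts tabulated in FIG. 11 may follow a different rounding rule.

```python
def max_resolution_id(width, height, template=20, step=1.25):
    """Largest resolution ID whose reduced image still fits the template.

    Assumes a square template and compares the shorter (unrounded) image
    side against the template side at each reduction phase.
    """
    shorter_side = min(width, height)
    n = 0
    # Advance while the next reduction phase is still at least template-sized.
    while shorter_side * (1.0 / step) ** (n + 1) >= template:
        n += 1
    return n


# QVGA input image with a 20 x 20 pixel face template.
print(max_resolution_id(320, 240))  # prints 11
```

Under these assumptions, the shorter side at the resolution ID=12 would fall below 20 pixels, so the sketch stops at the resolution ID=11, in agreement with the range described above.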
In the above example of the conventional technology, by reducing the input image 90 by a predetermined reduction rate in stepwise fashion, the images 91, 92, 93 . . . having different resolutions are generated based on the input image 90. Subsequently, in order from the image having the largest reduction rate (resolution ID=11) toward the image having the smallest reduction rate (resolution ID=0), images to be the target of the face detection process are selected, and a portion of each selected image is clipped and matched with the prepared face template. That is, the face detection process is performed in stepwise fashion on the images from the smallest image toward the largest image, thereby detecting a plurality of faces of different sizes contained in the input image 90 in order from the largest face.
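The scanning order described above can be illustrated by the following minimal sketch. The names `downscale`, `sad`, and `detect`, the nearest-neighbour reduction, and the sum-of-absolute-differences score with a fixed threshold are all assumptions made for illustration. The conventional apparatus is not stated to use any particular matching measure.

```python
def downscale(image, scale):
    """Nearest-neighbour reduction of a 2-D list of pixel values (assumed method)."""
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    return [[image[min(src_h - 1, int(y / scale))][min(src_w - 1, int(x / scale))]
             for x in range(dst_w)]
            for y in range(dst_h)]


def sad(image, template, top, left):
    """Sum of absolute differences between the template and one clipped window."""
    return sum(abs(image[top + y][left + x] - template[y][x])
               for y in range(len(template))
               for x in range(len(template[0])))


def detect(image, template, max_id, threshold, step=1.25):
    """Scan from the most reduced image (largest faces) down to resolution ID=0.

    Returns (resolution_id, x, y, size) tuples in input-image coordinates.
    """
    t_h, t_w = len(template), len(template[0])
    hits = []
    for n in range(max_id, -1, -1):
        scale = (1.0 / step) ** n
        reduced = downscale(image, scale)
        # Clip every template-sized window of the reduced image and match it.
        for top in range(len(reduced) - t_h + 1):
            for left in range(len(reduced[0]) - t_w + 1):
                if sad(reduced, template, top, left) <= threshold:
                    hits.append((n, round(left / scale), round(top / scale),
                                 round(t_w / scale)))
    return hits


# Toy example: an 8 x 8 image containing one 2 x 2 bright patch.
image = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (2, 3):
        image[y][x] = 9
template = [[9, 9], [9, 9]]
print(detect(image, template, max_id=0, threshold=0))  # prints [(0, 2, 2, 2)]
```

Because the outer loop runs from the highest resolution ID down to 0, matches found in the most reduced images, which correspond to the largest faces in the input image, are reported first.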
In recent years, a camera system capable of high speed shooting has been proposed. The camera system allows photographs to be taken at over 300 frames per second and, consequently, a decisive moment can be captured. In order to detect, with high accuracy, a face in an image captured by the camera system, preferably the face detection process as described above is performed on all frames of image data captured at 300 frames per second. However, the requisite throughput for the face detection process undesirably increases in proportion to the number of frames to which the face detection process is applied.
In consumer electronics of recent years, reduction in power consumption has been highly desired. In general, throughput of equipment is proportional to its power consumption. Thus it is essential to reduce the requisite throughput for the face detection process while maintaining the accuracy of the face detection.
For example, Patent Literature 2 has proposed a method of reducing the amount of computations and shortening the processing time, per frame image, while maintaining the accuracy in detecting a subject in the frame image. In Patent Literature 2, a valid frame rate is logically obtained and a detection range is limited, in accordance with the size of the detection subject and a vehicle speed, thereby avoiding useless processes. The method takes advantage of a property that a small subject seen in the distance does not rapidly change in its size when being approached. That is, when detecting a small subject, the detection accuracy can be maintained even when the frame rate of the detection subject is reduced.
Also, for example, Patent Literature 3 has proposed a method of changing recognition conditions in response to recognition of a specific face that is a target of recognition. In Patent Literature 3, when the specific face is recognized, the recognition conditions are changed at that point in time, so that the subsequent recognition processes are performed at a high speed without reducing the accuracy of the face recognition.