1. Field of the Invention
The present invention relates to an image processing apparatus and an image processing method that detect a specific object from an input image.
2. Description of the Related Art
Methods for detecting a specific object from an input image include one proposed by Viola and Jones (see P. Viola and M. Jones, “Robust Real-time Object Detection”, SECOND INTERNATIONAL WORKSHOP ON STATISTICAL AND COMPUTATIONAL THEORIES OF VISION, Jul. 13, 2001). According to an algorithm that implements this method, a rectangular small area (hereinafter referred to as “a sub window”) is extracted from an input image, and it is determined whether or not a human face is included in the sub window. A description will be given of the determination method with reference to FIGS. 7A to 7C.
FIGS. 7A to 7C are explanatory diagrams of a process for determining (detecting) a specific object (face-detecting process) by a determination processing section of a conventional image processing apparatus.
A determination processing section 700 has a configuration in which a plurality of determination devices 70 (70-1 to 70-n) are cascaded, and each determination device determines (detects) that there is a high possibility that the image is a face, or that the image is not a face. The fact that a face is detected by the determination processing section 700 means that it is determined by all of the determination devices that there is a high possibility that the image is a face.
For example, as shown in FIG. 7B, if a sub window 701 which contains a face in a small area thereof is input, the processing by the determination processing section 700 proceeds as indicated by a route 702. The route 702 shows that it is determined by a first determination device 70-1 that the sub window 701 is True (“True” indicates that it is determined that there is a high possibility that the image is a face). Then, it continues to be determined by all of the determination devices from a next determination device 70-2 to a last determination device 70-n that the sub window 701 is True, whereby it is determined that the sub window image 701 contains a face.
On the other hand, as shown in FIG. 7C, if a sub window 703 containing no face in a small area thereof is input, the processing by the determination processing section 700 proceeds as indicated by a route 704. The route 704 shows that it is determined by the first determination device 70-1 that the sub window 703 is False (“False” indicates that it is determined that the image is not a face). By this determination, it is determined that the sub window 703 does not contain any face.
The above-described process indicates that if it is determined by a determination device in a preceding step that the image is not a face, processing to be executed by determination devices in following steps can be omitted. Therefore, the number of steps to be executed for processing of a background part is reduced, which takes much load off the image processing apparatus.
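The early termination described above can be illustrated by the following minimal sketch. It is not the implementation of the determination processing section 700. The function names, the use of a single numeric score as a stand-in for a sub window, and the threshold values are all assumptions introduced only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: a determination device maps a sub window to True/False.
Determiner = Callable[[float], bool]


def run_cascade(sub_window: float,
                determiners: Sequence[Determiner]) -> Tuple[bool, int]:
    """Return (is_face, number of determination devices executed)."""
    executed = 0
    for determine in determiners:
        executed += 1
        if not determine(sub_window):
            # False: the remaining determination devices are skipped.
            return False, executed
    # True from every determination device: a face is detected.
    return True, executed


def make_threshold_determiner(threshold: float) -> Determiner:
    # Toy stand-in for a real determination device (assumption).
    return lambda score: score >= threshold


# Three cascaded toy devices with increasingly strict thresholds (assumed).
determiners = [make_threshold_determiner(t) for t in (0.2, 0.5, 0.8)]

face_result = run_cascade(0.9, determiners)        # passes every device
background_result = run_cascade(0.1, determiners)  # rejected by the first
near_face_result = run_cascade(0.6, determiners)   # rejected by the last
```

In this toy setting the background input is rejected after one device, whereas the face input and the near-face input each require all three devices to be executed.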
However, since the processing of a face part for the determination is performed by all of the determination devices, the number of steps to be executed is large, which increases load on the devices. Further, also in the vicinity of the face part, the image tends to be determined not to be a face only in one of the later steps, which makes the load heavier than that of the processing of the background part.
As an example of implementation of the above-described algorithm by hardware, there has been proposed a method of cascading data-processing circuits that perform processes associated with one or a plurality of determination devices. In this method, a buffer for data queue is provided between each adjacent pair of data-processing circuits to prevent input from the preceding step to the following step from being interrupted.
It is envisaged that the size of the buffer is determined when designing the determination processing section 700 such that a sufficient throughput can be ensured even in a state in which the largest load is expected, or such that demanded processing time is ensured.
FIG. 8 is a flowchart of a sub window position control process executed by the conventional image processing apparatus.
A description will be given of a conventional sub window position control method for determining a position of a sub window extracted from an input image with reference to the flowchart in FIG. 8.
In a step S801, a vertical position of a sub window to be extracted is initialized so as to set a start position of the sub window. In a next step S802, a horizontal position of the sub window is initialized so as to set the start position of the same. These steps form an initialization phase.
FIG. 9 is an explanatory diagram showing the movement of the sub window position during the sub window position control process executed by the conventional image processing apparatus.
As an upper left coordinate position of a sub window 902 in an input image 901 to be extracted, a horizontal position of the upper left coordinate position is denoted by Ph, and a vertical position of the same is denoted by Pv, which are respectively initialized to 0. The position of the sub window 902 in the input image 901 shown in FIG. 9 is the initial position of the sub window.
Next, in a step S803, Ph and Pv are respectively assigned to an output horizontal position Outh and an output vertical position Outv, as outputs of the position of the sub window 902 acquired from the input image 901.
Next, in a step S804, the horizontal position Ph is incremented by 1 to update the same so as to move the sub window by one pixel in the horizontal direction. The sub window moved from the position of the sub window 902 by one pixel in the horizontal direction is a sub window 903.
Next, in a step S805, it is determined whether or not the sub window has reached a horizontal end position. In the case of the position of the sub window 902, the answer to the question of the step S805 is NO, so that the process returns to the step S803 to repeat the above-described processing, whereby the sub window is sequentially moved to sub windows 903, 904, et seq. in the horizontal direction.
The sub window is thus sequentially moved in the horizontal direction, and finally the position of a sub window 905 as the horizontal end position is reached. This makes the answer to the question of the step S805 affirmative (YES), thereby terminating the loop of the processing in the horizontal direction. In the following step S806, to shift the sub window by one pixel in the vertical direction, the vertical position Pv is incremented by 1 to update the same.
Next, in a step S807, it is determined whether or not the sub window has reached a vertical end position. Since the position of the sub window 905 is not the vertical end position, the answer to this question is NO, so that the process returns to the step S802 to initialize the horizontal position Ph, which causes the sub window to be placed in the position of the sub window 906.
Thereafter, the movement processing is repeated in the horizontal and vertical directions. When the sub window reaches the position of a sub window 907 at the vertical end position, the answer to the question of the step S807 becomes affirmative (YES), thereby terminating the loop.
This conventional scanning sequence, illustrated in FIG. 8, is the same as that of a so-called raster scan.
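The sub window position control process of the steps S801 to S807 can be sketched as follows. This is a minimal illustration, not the conventional apparatus itself. The function name, the parameter names, and the end conditions (the last position at which the sub window still fits inside the input image) are assumptions, since the end positions are not defined numerically above. The sketch also assumes that the sub window is not larger than the input image.

```python
from typing import Iterator, Tuple


def raster_scan_positions(image_w: int, image_h: int,
                          win_w: int, win_h: int) -> Iterator[Tuple[int, int]]:
    """Yield (Outh, Outv) in the order of the steps S801 to S807."""
    # Assumed end conditions: stop once the sub window would leave the image.
    h_end = image_w - win_w + 1
    v_end = image_h - win_h + 1

    pv = 0                      # S801: initialize vertical position Pv
    while True:
        ph = 0                  # S802: initialize horizontal position Ph
        while True:
            yield ph, pv        # S803: output (Outh, Outv) = (Ph, Pv)
            ph += 1             # S804: move one pixel horizontally
            if ph >= h_end:     # S805: horizontal end position reached?
                break
        pv += 1                 # S806: move one pixel vertically
        if pv >= v_end:         # S807: vertical end position reached?
            break


# Toy example: a 4x3 input image scanned with a 2x2 sub window.
positions = list(raster_scan_positions(4, 3, 2, 2))
```

For the toy sizes above, the sub window visits three horizontal positions on each of two rows, and adjacent outputs differ by one pixel in the horizontal direction except at the end of a row.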
However, the conventional algorithm with which a specific object is detected from an input image has a tendency that sub windows in the vicinity of a specific object each continue to be determined to have a high possibility that the image is of the specific object, up to steps closer to the final step.
Therefore, according to the scanning sequence as illustrated in the sub window position control process in FIG. 8, since the sub window is moved, pixel by pixel, in the horizontal direction, as the sub window is closer to a face to be detected, the processing is more liable to proceed to steps closer to the final step. As a result, heavy load processing is continued, which increases the load per unit time.
FIG. 10 is a timing diagram of a sub window image process executed by the conventional image processing apparatus.
A description will be given of why the performance of the apparatus is degraded when the heavy load processing is continued with reference to the timing diagram in FIG. 10. In FIG. 10, the horizontal axis represents time, and the vertical axis represents sub window images and the types of the images, which are sequentially input according to the scan order. Further, the order of arrangement of the sub window images on the vertical axis from the top corresponds to the order of inputting of them from the start.
First, it is assumed as a precondition that determination devices for determining whether or not the possibility that an input image is a face is high are implemented by data-processing circuits (hereinafter referred to as “the stages”). The stages are a stage 1 (S1 in FIG. 10) and a stage 2 (S2 in FIG. 10) which are cascaded, and the stage 2 is assumed to require more processing time than the stage 1.
If a sub window image containing a face (face image) is transmitted to such a system, data-processing is performed by the stages 1 and 2 in the mentioned order, and if a sub window image containing no face (non-face image) is transmitted to the system, the data-processing is performed only by the stage 1. If face images 1 to 3 are continuously input, this leads to a state in which the face images 2 and 3 are caused to wait until processing of a preceding face image is completed by the stage 2.
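The waiting described above can be reproduced with the following minimal timing sketch. It is a simplified model and not a description of FIG. 10. The processing times of 1 and 3 time units, the function name, and the assumption of an unlimited queue between the stages (so that the stage 1 never stalls) are all hypothetical.

```python
from typing import List, Sequence


def stage2_wait_times(is_face: Sequence[bool],
                      t1: int = 1, t2: int = 3) -> List[int]:
    """Return, for each face image, the time spent waiting for the stage 2.

    Assumptions: the stage 1 takes t1 per image and processes the images
    back to back, the stage 2 takes t2 per face image, and the queue
    between the stages is unlimited.
    """
    waits = []
    stage2_free_at = 0
    for index, face in enumerate(is_face):
        stage1_done_at = (index + 1) * t1
        if not face:
            continue  # non-face image: processed only by the stage 1
        start = max(stage1_done_at, stage2_free_at)
        waits.append(start - stage1_done_at)
        stage2_free_at = start + t2
    return waits


# Face images 1 to 3 input continuously.
consecutive_waits = stage2_wait_times([True, True, True])

# The same number of face images separated by non-face images.
spread_waits = stage2_wait_times(
    [True, False, False, True, False, False, True])
```

Under these assumed timings, the second and third of the continuously input face images wait for the stage 2, and the waiting time grows with each further face image, whereas no face image waits when the face images are separated by non-face images.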
To enable images to wait for processing by the stage 2, it is required to provide buffers between the stages. However, in a case of processing on a sub window-basis, it is necessary to buffer the data for each sub window, and this increases the amount of data to be buffered. As a consequence, the size of each buffer circuit becomes so large that it is difficult to implement the determination processing section 700 by hardware.