1. Field of the Invention
The present invention relates to an apparatus and a method for detecting an object from an image, and a program.
2. Description of the Related Art
Several techniques have been discussed for detecting an object from an image captured by a camera. One is a method for detecting a moving object using a background difference method. In the background difference method, an image without an object is captured by a camera fixed in an immovable position and registered as a background image in advance. When an object is to be detected, a difference between the image captured by the camera and the background image is calculated, and the difference area is obtained as a moving object. Recently, improved techniques based on the background difference method have been discussed.
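The basic background difference method described above can be sketched as follows. This is a minimal illustration, assuming grayscale images stored as nested lists of pixel intensities; the function name `background_difference` and the threshold value of 30 are hypothetical choices and do not come from any of the cited documents.

```python
def background_difference(frame, background, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background.

    frame and background are equally sized 2-D lists of intensities.
    The threshold is an assumed value chosen for illustration only.
    """
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Background image registered in advance (no object present).
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

# Input image: two bright pixels where an object has entered,
# plus small intensity fluctuations elsewhere.
frame = [
    [10, 12, 10, 10],
    [10, 200, 210, 10],
    [10, 10, 9, 10],
]

mask = background_difference(frame, background)
for row in mask:
    print(row)
```

Small fluctuations below the threshold are ignored, so only the two pixels covered by the object are marked as the difference area.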
For example, in a technique discussed in Japanese Patent No. 3729562, a differential image of a background model is generated using different resolutions or different spatial frequency bands, and according to a difference image between the differential image of the background model and a differential image of an input image, it is determined whether the object is an object of interest. Thus, the differential images are utilized for robust determination with respect to variation in brightness due to a change in illumination.
In a technique discussed in Japanese Patent Application Laid-Open No. 2001-014474, after the background difference is calculated, whether a pixel belongs to the foreground or the background is determined by using a majority filter at the boundary between the foreground and the background. The majority filter is designed to determine a state of a target pixel according to the majority of states of neighboring pixels of the target pixel. Accordingly, an image including only a moving object is generated which is free of background noise or “moth-eaten” regions. In a technique discussed in Japanese Patent Application Laid-Open No. 2006-18658, in moving object detection, an unnecessary moving object is removed by calculating a logical product of a plurality of binary background difference images or expanding the binary background difference images.
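The majority filter described above can be sketched as follows. This is a simplified illustration under stated assumptions: the filter is applied to every pixel of a binary mask rather than only at the foreground/background boundary, the neighborhood is the 3x3 window around the target pixel (excluding the target itself), and a tie leaves the pixel unchanged. None of these details are taken from the cited publication.

```python
def majority_filter(mask):
    """Set each pixel to the majority state of its 3x3 neighbors.

    mask is a 2-D list of 0 (background) and 1 (foreground).
    The target pixel itself does not vote; ties keep the original state.
    """
    height, width = len(mask), len(mask[0])
    result = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            votes = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            foreground = sum(votes)
            if 2 * foreground > len(votes):
                result[y][x] = 1
            elif 2 * foreground < len(votes):
                result[y][x] = 0
    return result


# A foreground blob with a one-pixel hole in the middle,
# and an isolated noise pixel in the bottom-right corner.
noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1],
]

filtered = majority_filter(noisy)
print(filtered[2][2])  # the hole, surrounded by foreground neighbors
print(filtered[4][4])  # the isolated pixel, surrounded mostly by background
```

In this example the hole inside the blob is filled and the isolated pixel is cleared, which corresponds to removing "moth-eaten" regions and background noise.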
An example of a technique for detecting an object, such as a face or a person, from an image is discussed in United States Patent Application Publication No. 2007/0237387. The technique is directed to determining whether an input pattern is a human body or not by evaluating Histograms of Oriented Gradients (HOG) features. To obtain the HOG features, a gradient magnitude and a gradient direction are obtained for each pixel, and the gradient magnitudes of the pixels in rectangular areas referred to as cells are summed separately for each gradient direction. Since the gradient magnitude depends on the contrast in the image or the like, the gradient magnitude is normalized by averaging a total sum of the gradient magnitudes in a rectangular area referred to as a block.
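The per-cell summation and per-block normalization described above can be sketched as follows. This is an illustrative sketch only: the use of central differences for the gradient, nine unsigned orientation bins over 0 to 180 degrees, and normalization by dividing by the block's total gradient magnitude are common conventions assumed here, not details confirmed by the cited publication. The function names are hypothetical.

```python
import math


def cell_histogram(cell, bins=9):
    """Sum gradient magnitudes per orientation bin over one cell.

    cell is a 2-D list of intensities. Border pixels are skipped because
    central differences need a neighbor on each side.
    """
    histogram = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            histogram[int(angle // bin_width) % bins] += magnitude
    return histogram


def normalize_block(histograms, eps=1e-6):
    """Divide every bin by the total gradient magnitude of the block."""
    total = sum(sum(h) for h in histograms)
    return [[value / (total + eps) for value in h] for h in histograms]


# A 4x4 cell containing a vertical edge (dark left, bright right).
cell = [[0, 0, 10, 10] for _ in range(4)]

hist = cell_histogram(cell)
normalized = normalize_block([hist])
print(hist)
```

Because the edge is vertical, the gradient is horizontal and all of the magnitude falls into the first orientation bin. After normalization the feature no longer depends on the contrast of the edge.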
For the cell and the block, an area effective for determination is selected by AdaBoost learning from among various positions and sizes on patterns. A discriminator determines whether an object is a human body or not based on the HOG features. A plurality of such discriminators is connected in series. A subsequent-stage discriminator performs its determination only when the preceding discriminator has determined that the object is a human body, and thus high-speed processing can be realized.
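The series connection of discriminators can be sketched as follows. This is a minimal illustration of early rejection in a cascade; the stage scoring functions, feature names, and thresholds are hypothetical placeholders and are not the AdaBoost-trained discriminators of the cited publication.

```python
def cascade_detect(features, stages):
    """Run discriminators in series and stop at the first rejection.

    stages is a list of (score_function, threshold) pairs.
    Returns (is_target, number_of_stages_evaluated).
    """
    evaluated = 0
    for score, threshold in stages:
        evaluated += 1
        if score(features) < threshold:
            # Rejected: later stages are never evaluated.
            return False, evaluated
    return True, evaluated


# Hypothetical stages, ordered from cheap and coarse to strict.
stages = [
    (lambda f: f["edge_energy"], 0.2),
    (lambda f: f["vertical_symmetry"], 0.5),
    (lambda f: f["head_shoulder_score"], 0.7),
]

person_like = {"edge_energy": 0.9, "vertical_symmetry": 0.8,
               "head_shoulder_score": 0.75}
flat_wall = {"edge_energy": 0.05, "vertical_symmetry": 0.9,
             "head_shoulder_score": 0.1}

print(cascade_detect(person_like, stages))
print(cascade_detect(flat_wall, stages))
```

Most windows in an image resemble the second input and are rejected by the first stage, which is why the series arrangement reduces the average processing cost.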
However, the technique discussed in Japanese Patent No. 3729562 is not robust to movement of the background. For instance, if part of the background sways in the wind, good gradient difference images cannot be obtained. In addition, since the background model and an input image have the same gradients in the background areas, the gradients cancel each other when a difference is calculated between the background areas. In the foreground area, on the other hand, the gradients do not cancel each other, so that the gradient component of the background area remains in the foreground area of the difference image.
In the technique discussed in Japanese Patent Application Laid-Open No. 2001-014474, if a correct image of a moving object is obtained, it is easy to determine what the moving object is from its contour. However, when the resolution used to determine the background is decreased, this technique cannot robustly maintain determination performance. For example, if the background is determined in blocks of 8*8 pixels, determination accuracy can be improved and the processing cost can be decreased. However, if the technique in Japanese Patent Application Laid-Open No. 2001-014474 is used to detect a human figure with a height of about 100 pixels, the human figure appears in a size of only about four blocks wide and 12 blocks high, so that the contour of the human figure becomes inconspicuous.
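The loss of contour detail at block resolution can be illustrated with a short sketch. It assumes a hypothetical rule in which a block is treated as foreground when at least half of its pixels are foreground, and it uses a filled 96*32 pixel rectangle as a stand-in for a human figure of about 100 pixels in height. Neither the rule nor the silhouette comes from the cited publication.

```python
def block_mask(mask, block=8, min_fraction=0.5):
    """Reduce a pixel-level binary mask to one decision per block.

    A block becomes foreground when at least min_fraction of its
    pixels are foreground (an assumed rule for illustration).
    """
    height, width = len(mask), len(mask[0])
    result = []
    for by in range(0, height, block):
        row = []
        for bx in range(0, width, block):
            pixels = [
                mask[y][x]
                for y in range(by, min(by + block, height))
                for x in range(bx, min(bx + block, width))
            ]
            row.append(1 if sum(pixels) >= min_fraction * len(pixels) else 0)
        result.append(row)
    return result


# Stand-in silhouette: 96 pixels high and 32 pixels wide, all foreground.
figure = [[1] * 32 for _ in range(96)]

blocks = block_mask(figure)
print(len(blocks), len(blocks[0]))  # block rows, block columns
```

The figure is reduced to a grid of 12 by 4 block decisions, so any contour detail finer than one 8*8 block is no longer represented.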
According to the technique in Japanese Patent Application Laid-Open No. 2006-18658, in order to determine whether a moving object is an unnecessary one or not in a method for detecting a moving object, small moving objects are deleted by expansion and reduction processing, or areas which come into contact with each other as a result of the expansion are combined as a single moving object. However, this technique is unable to discriminate a person from other moving objects with high accuracy. The technique discussed in United States Patent Application Publication No. 2007/0237387 is directed to detecting an object, such as a human figure, in an image. However, when this technique is applied to a moving image, an issue may arise in that an area in the background which looks like a detection target is mistakenly detected as a moving object.