1. Field of the Invention
The present invention relates to an apparatus that detects a moving object from an image, and a method thereof.
2. Description of the Related Art
There is a method for detecting a moving object such as a person or a car in time-sequential images by previously extracting, from the images, regions in which there is motion and then performing detection. Such a detection method is effective in terms of processing speed and accuracy.
There are various methods for performing human detection using still images, such as a method combining Histograms of Oriented Gradients (HOG) with AdaBoost. Such methods perform human detection using only a feature of a shape or a texture, so that misrecognition may occur as a result of noise, an incidental texture, or an arrangement of objects. More specifically, misrecognition may occur when there is an object that is visually similar to the object to be detected, or when a similar feature is accidentally generated at a certain time. However, such misrecognition can usually be prevented by using motion information.
Methods for extracting moving regions are as follows. A background difference method extracts a target region based on the difference between previously prepared background information and an image of the current frame. An interframe difference method extracts the components that change between consecutive frames. An optical flow method estimates motion using, for example, a Lucas-Kanade algorithm or block matching.
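By way of a non-limiting illustration, the background difference method and the interframe difference method described above may be sketched as follows. The sketch assumes grayscale frames held as nested lists of intensities and a fixed per-pixel threshold of 30; the function names, the threshold value, and the toy frames are hypothetical and are not taken from any cited reference.

```python
from typing import List

Frame = List[List[int]]   # grayscale frame: rows of 0-255 intensities
Mask = List[List[bool]]   # True where a pixel is assumed to have changed


def difference_mask(a: Frame, b: Frame, threshold: int = 30) -> Mask:
    """Mark pixels whose absolute intensity difference exceeds the threshold."""
    return [[abs(pa - pb) > threshold for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def background_difference(background: Frame, current: Frame,
                          threshold: int = 30) -> Mask:
    """Compare the current frame with previously prepared background information."""
    return difference_mask(background, current, threshold)


def interframe_difference(previous: Frame, current: Frame,
                          threshold: int = 30) -> Mask:
    """Compare two consecutive frames with each other."""
    return difference_mask(previous, current, threshold)


def make_frame(object_col: int) -> Frame:
    """Hypothetical 3x4 frame: dark background with one bright object pixel."""
    frame = [[10] * 4 for _ in range(3)]
    frame[1][object_col] = 200
    return frame


background = [[10] * 4 for _ in range(3)]
frame_1 = make_frame(1)   # object at column 1
frame_2 = make_frame(2)   # object has moved to column 2

bg_mask = background_difference(background, frame_2)
if_mask = interframe_difference(frame_1, frame_2)
still_mask = interframe_difference(frame_2, frame_2)
```

In this toy example the background difference marks only the current position of the object, whereas the interframe difference marks both the previous and the current positions. When the object does not move between two frames, the interframe difference is empty.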
However, it is difficult for an object detection method based only on extraction of motion information to perform accurate detection in the following cases: when a noise component such as a shadow or lighting fluctuation is included in the image, when the background itself changes, for example when trees sway in the wind, or when the background information changes due to movement of a camera.
To solve such a problem, there is a method in which a region of interest (ROI) is previously limited based on the motion information, and the object is detected by performing template matching with respect to the ROI.
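As a minimal sketch of the approach described above, the ROI may be taken as the bounding box of a motion mask, and template matching may then be restricted to that box. The sketch assumes a sum of absolute differences (SAD) as the matching score and nested lists as images; these choices, together with all names and values below, are illustrative assumptions rather than the method of any cited reference.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]
Mask = List[List[bool]]
Box = Tuple[int, int, int, int]   # top, left, bottom, right (inclusive)


def motion_roi(mask: Mask) -> Optional[Box]:
    """Bounding box of all True pixels in the motion mask, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, moved in enumerate(row) if moved]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def match_in_roi(image: Frame, template: Frame,
                 roi: Box) -> Optional[Tuple[int, int, int]]:
    """Return (row, col, sad) of the best template placement inside the ROI.

    Only placements lying entirely within the ROI are scored, which is
    what limits the search compared with scanning the whole image.
    """
    top, left, bottom, right = roi
    th, tw = len(template), len(template[0])
    best = None
    for r in range(top, bottom - th + 2):
        for c in range(left, right - tw + 2):
            sad = sum(abs(image[r + i][c + j] - template[i][j])
                      for i in range(th) for j in range(tw))
            if best is None or sad < best[2]:
                best = (r, c, sad)
    return best


# Hypothetical 6x6 image containing a 2x2 pattern at row 2, column 3.
template = [[9, 1], [1, 9]]
image = [[0] * 6 for _ in range(6)]
for i in range(2):
    for j in range(2):
        image[2 + i][3 + j] = template[i][j]

# Hypothetical motion mask covering rows 1-4 and columns 2-5.
mask = [[1 <= r <= 4 and 2 <= c <= 5 for c in range(6)] for r in range(6)]

roi = motion_roi(mask)
best = match_in_roi(image, template, roi)
```

In this example only nine placements inside the ROI are scored instead of the twenty-five possible over the whole image. If the motion mask is empty, no ROI exists and no matching is attempted, which corresponds to the limitation that a stationary object is not detected.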
For example, Japanese Patent Application Laid-Open No. 2007-164720 discusses detecting a head of a person by applying an ellipse to an image region extracted by performing the background difference method. The detection accuracy of the object is thus improved by extracting the moving region and performing template matching.
Further, Japanese Patent Application Laid-Open No. 2006-79272 and Japanese Patent Application Laid-Open No. 2008-225734 discuss detecting a person by quantifying the feature of the motion.
However, the technique discussed in Japanese Patent Application Laid-Open No. 2007-164720 assumes that the object is moving, so that the object cannot be detected once it has stopped moving. A stationary detection object may in principle be extracted using the background difference method. In such a case, however, the background information must be sequentially updated to maintain accurate detection, so that if the detection object remains stopped for a predetermined length of time, it becomes included in the background information. As a result, the object can no longer be detected.
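The absorption of a stopped object into the background information can be illustrated with a short simulation. The update rule used here is a running average with blending factor `alpha`, which is only one assumed way of sequentially updating the background; the cited references do not specify the update scheme, and the values of `alpha`, the threshold, and the pixel intensities are hypothetical.

```python
from typing import List

Frame = List[List[float]]


def update_background(background: Frame, frame: Frame, alpha: float) -> Frame:
    """Blend the current frame into the background (assumed running average)."""
    return [[(1.0 - alpha) * b + alpha * f for b, f in zip(row_b, row_f)]
            for row_b, row_f in zip(background, frame)]


def object_detected(background: Frame, frame: Frame, threshold: float) -> bool:
    """True if any pixel differs from the background by more than the threshold."""
    return any(abs(f - b) > threshold
               for row_b, row_f in zip(background, frame)
               for b, f in zip(row_b, row_f))


def simulate(num_frames: int, alpha: float = 0.5,
             threshold: float = 30.0) -> List[bool]:
    """Detection result per frame while a bright object stays at one pixel."""
    background = [[10.0, 10.0]]
    frame = [[10.0, 200.0]]      # the object has stopped at the second pixel
    detections = []
    for _ in range(num_frames):
        detections.append(object_detected(background, frame, threshold))
        background = update_background(background, frame, alpha)
    return detections


detections = simulate(5)
```

With these assumed values the difference at the object pixel halves at every update (190, 95, 47.5, 23.75, and so on), so the stationary object is detected in the first three frames and is missed from the fourth frame onward. A smaller `alpha` delays the absorption but also slows adaptation to genuine background changes.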
Further, Japanese Patent Application Laid-Open No. 2006-79272 is directed to selecting a person who has fallen while walking, and Japanese Patent Application Laid-Open No. 2008-225734 is directed to selecting an abnormal action in an elevator, i.e., only specific actions. The techniques cannot detect motions other than such specific actions. Further, if there is motion other than that of the detection object, such as a car passing in the background out of doors, it becomes difficult to perform human detection using the motion information.
As described above, when the detection object alternately moves and stops, it cannot be accurately detected using the conventional techniques. For example, when human detection is to be performed in an ordinary environment, there are many situations in which the person does not move. As a result, human detection cannot be performed using only the motion information, or the detection accuracy is lowered by the use of the motion information.
Nevertheless, in a system that performs human detection in an image, the detection accuracy is expected to improve when the motion information is used in addition to the shape information in still images. There is thus a demand for a technique that uses the motion information to solve the problems of misrecognition and detection failure that occur when detection is performed using only the still images.