1. The Field of the Invention
The present invention relates to an apparatus and a method for detecting entities in an image.
2. The Relevant Technology
As is known, safety and security are very important in modern society. For many years, systems that allow automatic detection of entities (such as people and/or objects) that may be present in digital images, like those normally acquired by video cameras, have become increasingly widespread.
A first example of a safety system is given by pedestrian detection systems included in many cars now available on the market; such systems acquire digital images from a video camera (usually positioned in the upper part of the windscreen) and process them for detecting a pedestrian that may be in front of the vehicle and estimating the distance therefrom, so as to be able to decide if an assisted maneuver needs to be carried out in order to protect the pedestrian(s) (e.g., increasing the pressure in the braking system, emergency braking, avoiding the pedestrian, or the like).
An example of a security system is given by video surveillance systems capable of automatically detecting the presence of people and/or vehicles (and even the type of vehicle) in an image, and of taking the necessary actions (e.g., starting the recording of the video stream, highlighting the entity by means of a picture superimposed on the video stream, warning a surveillance operator, or the like), thus not requiring a physical surveillance operator to continuously watch the video stream.
Both of these applications are implemented through computer means configured for executing a detection method based on the Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM) techniques.
In such applications, one of the most important requirements of these detection methods is to produce as few false positives and false negatives as possible, because either might lead to unpleasant consequences. A false negative might lead, for example, to a person being knocked down or a surveillance operator not being warned about an intrusion attempt, whereas a false positive might lead, for example, to unnecessary emergency braking (with the risk of rear-ending) or too many false alarms being signaled to a video surveillance operator (with the risk that the operator's attention level will decrease).
One way to reduce the number of false positives/negatives generated by HOG/SVM-based methods is to increase the resolution of the image processed by such methods, so as to be able to generate histograms of oriented gradients with a greater number of classes and/or with a greater difference among the elements contained in the various classes, thus ensuring higher entity detection accuracy (reduction in the number of false positives/negatives) and/or the recognition of a greater number of distinct entities. In this latter case, a HOG/SVM-based method can be used for discerning between a pedestrian crossing the street and a person running along the edge of the street, or for discerning between a normally dressed person and a person wearing a balaclava, possibly for the purpose of not being recognized while performing a criminal action.
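By way of illustration only, the notion of a histogram of oriented gradients with a configurable number of classes can be sketched as follows. This is a minimal sketch and not the method of the invention: the function name `cell_histogram`, the parameter `n_bins`, the use of central differences, and the unsigned 0 to 180 degree orientation range are all assumptions introduced for the example.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Orientation histogram for one cell (hypothetical helper).

    `cell` is a list of rows of pixel intensities. Gradients are taken
    with central differences on interior pixels only (an assumption),
    and each pixel votes with its gradient magnitude into one of
    `n_bins` classes covering unsigned orientations in [0, 180) degrees.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against an angle rounding up to 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Example input (assumed): an 8x8 cell whose intensity grows left to right,
# so every interior gradient points along the horizontal axis.
horizontal_ramp = [[x for x in range(8)] for _ in range(8)]
ramp_hist = cell_histogram(horizontal_ramp, n_bins=9)
```

In this sketch, raising `n_bins` yields a finer subdivision of orientations, which is only informative when the cell contains enough pixels to populate the additional classes; this is one reason why a higher image resolution may be beneficial.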
However, for a given amount of available computing power, the increased resolution implies an increase in the computational load. This limits the use of HOG/SVM-based methods in practical applications like those described above, since such applications must meet stringent time constraints and are therefore real-time applications.
In order to meet these time constraints, it is therefore necessary to increase the number of points (pixels) of an image that can be processed within a time unit by an apparatus configured for executing instructions for implementing a HOG/SVM-based image detection method.
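The relationship between resolution and the number of pixels to be processed per time unit can be illustrated with simple arithmetic. The following is a sketch under stated assumptions: the resolutions (640x480 and 1920x1080) and the frame rate (30 frames per second) are hypothetical values chosen for the example and do not appear in the present description, and the function name `required_pixel_rate` is likewise illustrative.

```python
def required_pixel_rate(width, height, frames_per_second):
    """Pixels per second that must be processed to keep up with the stream."""
    return width * height * frames_per_second


# Hypothetical figures, chosen only to show how the load scales.
low_res_rate = required_pixel_rate(640, 480, 30)
high_res_rate = required_pixel_rate(1920, 1080, 30)

# Factor by which the per-second pixel load grows at the higher resolution.
load_ratio = high_res_rate / low_res_rate
```

Under these assumed figures, the higher resolution requires 62,208,000 pixels per second instead of 9,216,000, that is, 6.75 times as many pixels within the same time unit.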