1. Field of Invention
The present patent document is directed towards systems and methods for object detection. More particularly, the present patent document is directed towards systems and methods that consider background when generating and using object detection models.
2. Description of the Related Art
Detecting humans in images is important to many applications, such as surveillance, robotics, and automotive safety. However, human detection, like object detection in general, is among the most challenging vision tasks, due in part to the great variety in the appearance and shape of human figures, highly cluttered environments, and image sources that are often of low resolution and low quality.
Predominant approaches for human detection usually scan the input images with sliding windows at various scales to identify the locations and scales of the human figures they contain. To determine whether a local window includes a human figure, both generative and discriminative approaches have been developed. The generative approaches typically infer the posterior probability for the pedestrian (human) class using discrete or continuous shape models, or by combining shape and texture models. The discriminative approaches extract image features in the local window and construct classifiers for detection. For this purpose, various features have been proposed, such as Haar wavelet features, gradient-based features, shape-based features, combinations of multiple features, automatically mined features, and pose-invariant features. The local features are then used to identify humans in the classification process by algorithms such as AdaBoost or support vector machines (SVMs). These classifiers typically either treat the human figure as a single object or are based on parts detectors. The parts-based methods treat the human figure as an assembly of different body parts. Individual parts are detected, and the results are combined to classify the whole human figure.
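By way of illustration only, the multi-scale sliding-window scan described above might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular prior-art method: the 64x128 base window, the stride of 8 pixels, the scale factors, and the names `sliding_windows`, `detect`, and `score_window` are all hypothetical choices made for this example. The `score_window` callback stands in for whatever feature extraction and classifier (for example, a boosted classifier or an SVM) is applied to each local window.

```python
from typing import Callable, Iterator, List, Tuple

# A candidate window as (x, y, width, height) in image coordinates.
Box = Tuple[int, int, int, int]


def sliding_windows(image_w: int, image_h: int,
                    base_w: int = 64, base_h: int = 128,
                    stride: int = 8,
                    scales: Tuple[float, ...] = (1.0, 1.5, 2.0)) -> Iterator[Box]:
    """Yield every window that fits inside the image, at each scale.

    The base window size, stride, and scales are assumed values.
    """
    for s in scales:
        w = int(round(base_w * s))
        h = int(round(base_h * s))
        step = max(1, int(round(stride * s)))  # stride grows with the window
        # range() is empty when the scaled window is larger than the image.
        for y in range(0, image_h - h + 1, step):
            for x in range(0, image_w - w + 1, step):
                yield (x, y, w, h)


def detect(image_w: int, image_h: int,
           score_window: Callable[[Box], float],
           threshold: float = 0.0) -> List[Box]:
    """Return the windows whose classifier score exceeds the threshold.

    score_window is a hypothetical callback that extracts features from
    the window and returns a classification score.
    """
    return [box for box in sliding_windows(image_w, image_h)
            if score_window(box) > threshold]
```

Note that a scan of this kind scores each window using only the pixels inside it, which is the limitation discussed next.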
It should be noted, however, that these methods use only the information inside the human region. That is, these detection methods seek to detect the object without taking into account its neighborhood context. A significant issue is that the input images are often captured under different conditions, such as different levels of exposure, different ISO settings, different backgrounds, and the like. The varying contrast levels in the input images make detection difficult for such systems.
Accordingly, systems and methods are needed that can address the challenges presented by varying backgrounds when trying to detect an object or item in an image.