1. Field of the Invention
The present invention relates to techniques for detecting objects within images. More specifically, the present invention relates to a method and an apparatus for calibrating sampling operations during an object detection process.
2. Related Art
As computer systems are becoming more powerful, they are being used for increasingly computationally intensive tasks involving large images. One such task is “object detection.” The goal of object detection is to determine the presence and location of objects of a given type (such as faces) within a digital image. Typically, object detection begins by training a classifier (an object detector) to recognize the presence of the object of interest within a two-dimensional window of a suitable aspect ratio. The goal of object detection, as of any search problem, is to find the local maxima in the detector's response function that have a response above a given acceptance threshold.
Traditional search methods, such as gradient descent and the simplex method, cannot be applied to this search problem because the surface of the detector response function is flat (except for noise) for any subwindow located away from the object. Therefore, the common approach to object detection is a brute-force search of every subwindow in the image, at every scale in a pre-defined set of scales, and sometimes at every rotation in a pre-defined set of rotations. For example, the detector may be trained to determine whether a given 20×20 window of grayscale pixels represents a low-resolution frontal view of a human face. To determine whether a digital image contains a face, the detector can be applied to every 20×20 scan window in the image, with the scan repeated so that it covers a comprehensive set of positions, scales and orientations.
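By way of illustration only, the brute-force search described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular system: the names brute_force_detect, downscale and crop, the integer subsampling factors, and the threshold value are all hypothetical, the detector is a stand-in callable that returns a response for a window of pixels, and the set of rotations is omitted for brevity.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel intensities
Detection = Tuple[int, int, int, float]  # (x, y, window side in source pixels, response)


def downscale(image: Image, factor: int) -> Image:
    """Nearest-neighbor subsample: keep every `factor`-th pixel along each axis."""
    return [row[::factor] for row in image[::factor]]


def crop(image: Image, x: int, y: int, size: int) -> Image:
    """Return the size-by-size subwindow whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in image[y:y + size]]


def brute_force_detect(
    image: Image,
    detector: Callable[[Image], float],  # hypothetical stand-in for a trained classifier
    window: int = 20,                    # side of the scan window the detector expects
    factors: Sequence[int] = (1, 2, 4),  # assumed pre-defined set of scales
    threshold: float = 0.5,              # assumed acceptance threshold
) -> List[Detection]:
    """Apply the detector to every window position at every scale."""
    detections: List[Detection] = []
    for factor in factors:
        scaled = downscale(image, factor)
        height = len(scaled)
        width = len(scaled[0]) if scaled else 0
        # The ranges are empty when the scaled image is smaller than the window.
        for y in range(height - window + 1):
            for x in range(width - window + 1):
                response = detector(crop(scaled, x, y, window))
                if response >= threshold:
                    # Report the window in the coordinates of the original image.
                    detections.append((x * factor, y * factor, window * factor, response))
    return detections
```

In this sketch the detector is invoked once for every window position at every scale, regardless of where objects are located, which is why the cost of the exhaustive approach grows quickly with image size and with the number of scales.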
Although the above-described approach is guaranteed to find all occurrences of the object that the detector is able to recognize, it can be prohibitively time consuming for many applications.
Furthermore, some applications, such as video surveillance systems that provide real-time image analysis, can devote only a fixed amount of time to each image. These applications try to do the best they can while keeping up with the frame rate. Other applications can take more time but need the best possible intermediate results. One example is a computer-assisted person-tagging system, in which the user can start correcting the tag assignments before the system has fully analyzed all of the images. Hence, in some cases comprehensive detection may take more time than the system can allow, and in other cases it is better for the system to spend more time in the hope of finding more instances of the object. Unfortunately, the tradeoff between speed and detection rate is hard-coded in traditional systems and cannot be changed dynamically.
Hence, what is needed is a method and an apparatus for detecting an object within an image without the above-described problems.