Conventional object detection applications extract a large number of multi-scale features from an image in order to detect an object in the image. For example, in face detection, conventional solutions may extract approximately 200,000 features from an image with a maximum dimension of 1,480 pixels. It is to be appreciated that extracting such a high number of features can be very time consuming, and thus becomes the speed bottleneck of conventional object detection processes.
To recognize an object, such as a face, conventional exemplar-based object detection methods use a large collection of exemplars as classifiers. The detection procedure is computationally expensive because the similarity between each test region and each exemplar must be calculated. Some recent solutions select the exemplars that are the most informative in determining the existence of objects. However, these solutions still use a relatively large number of exemplars (e.g., 3,000 exemplars) to detect an object. When 3,000 exemplars are used, the detection stage (excluding feature extraction) may take up to 2 seconds for a 1480×986 pixel image.
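
The cost described above comes from comparing every test region against every exemplar, so the number of similarity computations is the product of the two counts. The following is a minimal sketch of that brute-force scoring loop, not the method of any particular conventional solution. The names `cosine_similarity` and `score_regions` are hypothetical, and both the use of cosine similarity and the choice to keep the best score per region are assumptions made for illustration; the text above does not specify either.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the source does not name a similarity measure; cosine
    similarity is used here only as a stand-in.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_regions(region_features, exemplar_features):
    """Score each test region against every exemplar (brute force).

    Returns (scores, comparisons), where scores[i] is the highest
    similarity of region i to any exemplar (an illustrative aggregation,
    not taken from the source) and comparisons counts the similarity
    computations performed.
    """
    scores = []
    comparisons = 0
    for region in region_features:
        best = -1.0  # lowest possible cosine similarity
        for exemplar in exemplar_features:
            best = max(best, cosine_similarity(region, exemplar))
            comparisons += 1
        scores.append(best)
    return scores, comparisons


# Toy data: 3 test regions and 2 exemplars, each a 2-dimensional feature.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
exemplars = [[1.0, 0.0], [0.0, 2.0]]
scores, comparisons = score_regions(regions, exemplars)
print(comparisons)  # 3 regions x 2 exemplars = 6 comparisons
```

Because the inner loop runs once per exemplar for every region, reducing the number of exemplars reduces the detection cost proportionally, which is why exemplar selection matters even before feature extraction is considered.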