The present invention relates to training an attentional cascade, also known as a rejection cascade.
As shown in FIG. 1, an attentional cascade 100 includes a sequence of detector functions 120 trained to recognize whether a given image 110 resembles an object of interest. In this specification, the term "image" can refer to an entire image or to a portion of an image. Images are passed from one detector function to the next until they are either accepted 160 as resembling an object of interest, or rejected 150 as not resembling the object of interest. An attentional cascade improves object recognition by quickly rejecting images that are easily recognized as not containing the object of interest, while devoting more computation to more difficult images. The speed of an attentional cascade makes it desirable compared with other methods for performing object recognition, for example techniques using clustering or neural networks, but this speed comes at the expense of identification accuracy.
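The pass-or-reject flow described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the names `run_cascade`, `mean_brightness_ok`, and `peak_ok`, the representation of an image as a flat sequence of intensities, and the threshold values are all assumptions chosen only for illustration.

```python
from typing import Callable, Sequence

# Assumption: an image (or image portion) is a flat sequence of intensities in [0, 1].
Image = Sequence[float]
# Assumption: a detector function returns True to pass the image to the next stage.
Detector = Callable[[Image], bool]


def run_cascade(detectors: Sequence[Detector], image: Image) -> bool:
    """Return True if every detector accepts the image, False at the first rejection."""
    for detector in detectors:
        if not detector(image):
            return False  # rejected: the remaining detectors are never evaluated
    return True  # accepted by every detector in the sequence


# Hypothetical toy detectors, ordered from a coarse test to a stricter one.
def mean_brightness_ok(image: Image) -> bool:
    return sum(image) / len(image) > 0.2


def peak_ok(image: Image) -> bool:
    return max(image) > 0.8


cascade = [mean_brightness_ok, peak_ok]
accepted = run_cascade(cascade, [0.5, 0.9, 0.4])
```

Because the loop returns at the first rejection, an image that fails an early detector incurs none of the cost of the later ones, which is the source of the speed advantage noted above.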
One conventional method for training an attentional cascade is known as supervised learning. In supervised learning, a supervised learning program trains each detector function in an attentional cascade in sequence by providing a set of example images and a set of counter-example images to a training algorithm. In each training stage, the supervised learning program draws images randomly from a large collection of counter-examples in order to form a set of counter-examples. The supervised learning program retains only images that pass through previously trained detector functions to serve as counter-examples for subsequent training stages. As this process continues and the number of trained stages increases, the training task for each new stage becomes increasingly difficult, because the set of counter-examples consists only of images that passed through the previous detector functions as false positives and that are thus hard to discriminate from the object of interest.
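The stage-by-stage selection of counter-examples described above can be sketched as follows. This is a hedged illustration, not the training algorithm of any particular system: the function names, the `max_draws` cap, and the toy `train_threshold_stage` trainer (a simple threshold on mean intensity) are assumptions introduced here, and a real training algorithm would be substituted for the toy trainer.

```python
import random
from typing import Callable, List, Sequence

Image = Sequence[float]             # assumption: flat sequence of intensities
Detector = Callable[[Image], bool]  # True means the image passes this stage


def mean(image: Image) -> float:
    return sum(image) / len(image)


def passes_all(detectors: Sequence[Detector], image: Image) -> bool:
    """True if the image passes every detector trained so far."""
    return all(d(image) for d in detectors)


def draw_counter_examples(pool, trained, count, rng, max_draws=100_000):
    """Draw random images from the pool, keeping only those that pass every
    previously trained detector (the current false positives)."""
    selected: List[Image] = []
    for _ in range(max_draws):  # hypothetical cap so the loop always terminates
        if len(selected) == count:
            break
        candidate = rng.choice(pool)
        if passes_all(trained, candidate):
            selected.append(candidate)
    return selected


def train_threshold_stage(examples, negatives) -> Detector:
    """Toy stand-in for a training algorithm: a threshold on mean intensity
    that never rejects any of the example images."""
    lowest_positive = min(mean(x) for x in examples)
    average_negative = sum(mean(x) for x in negatives) / len(negatives)
    threshold = min(lowest_positive, (lowest_positive + average_negative) / 2)
    return lambda image: mean(image) >= threshold


def train_cascade(examples, pool, train_stage, num_stages, per_stage, seed=0):
    """Train detectors in sequence, each against fresh false positives."""
    rng = random.Random(seed)
    trained: List[Detector] = []
    for _ in range(num_stages):
        negatives = draw_counter_examples(pool, trained, per_stage, rng)
        if not negatives:
            break  # no counter-example passes the cascade any longer
        trained.append(train_stage(examples, negatives))
    return trained
```

In this sketch the fraction of random draws that survive `passes_all` shrinks as detectors are added, so later stages need more draws to collect counter-examples, and the ones they do collect lie closer to the example images. That is the growing difficulty the paragraph above describes.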