Object detection and tracking in video sequences may be important in applications such as content-based retrieval, natural human-computer interfaces, object-based video compression, and video surveillance. Classifiers that provide early rejection of non-object patterns may be used for object detection and tracking. In one approach, a number of classifiers may be arranged in a cascade. An input pattern may be evaluated by a first-stage classifier trained to remove a certain percentage of non-object patterns while keeping all object patterns. Second and subsequent stage classifiers may be trained in the same manner, each evaluating only the patterns passed by the preceding stage. After N stages, the false alarm rate may drop very close to zero while the cascade maintains a high hit rate.
From stage to stage, a more complex classifier may be needed to achieve this goal, because the non-object patterns that survive earlier stages may be harder to distinguish from object patterns. While the cascade approach has been successfully validated for detection of frontal upright faces, which tend to be very regular and similar, cascade classifiers may have difficulty handling visually more complex and diverse object classes such as multi-view faces and mouths.
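The cascade arrangement described above may be illustrated with a minimal sketch. The function names (`cascade_accepts`, `overall_rate`), the three toy stage tests, and all thresholds are hypothetical and chosen only for illustration; they are not taken from any particular implementation. The rate calculation further assumes that each stage has the same rate and that stages act independently, which is a simplification.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is any function that returns True to pass a pattern on
# and False to reject it. (Hypothetical type alias for this sketch.)
Stage = Callable[[Sequence[float]], bool]


def cascade_accepts(pattern: Sequence[float],
                    stages: List[Stage]) -> Tuple[bool, int]:
    """Run a pattern through the stages in order.

    Returns (accepted, number_of_stages_evaluated). A pattern is
    rejected as soon as any stage returns False, so later (and
    typically costlier) stages are never run on it.
    """
    evaluated = 0
    for stage in stages:
        evaluated += 1
        if not stage(pattern):
            return False, evaluated  # early rejection
    return True, evaluated


def overall_rate(per_stage_rate: float, n_stages: int) -> float:
    """Overall rate after n stages, ASSUMING identical, independent stages."""
    return per_stage_rate ** n_stages


# Toy stages of increasing strictness (illustrative thresholds only).
def stage_mean(p: Sequence[float]) -> bool:
    return sum(p) / len(p) > 0.2


def stage_range(p: Sequence[float]) -> bool:
    return max(p) - min(p) > 0.3


def stage_peak(p: Sequence[float]) -> bool:
    return max(p) > 0.9


STAGES: List[Stage] = [stage_mean, stage_range, stage_peak]
```

For example, `cascade_accepts([0.1, 0.1, 0.1], STAGES)` stops after the first stage, whereas a pattern that passes every test is evaluated by all three. Under the stated independence assumption, 20 stages that each pass half of the non-object patterns give an overall false alarm rate of `overall_rate(0.5, 20)`, which is below one in a million, while a per-stage hit rate of 0.999 gives an overall hit rate of about 0.98.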