Classifiers are used in many application environments, including machine learning, pattern recognition, and data mining. In general, a classifier provides a function that maps (or classifies) an instance into one of multiple predefined potential classes. A classifier typically predicts one attribute of a set of instances given one or more attributes (or features). The attribute being predicted typically is called the label, and the attributes used for prediction typically are called descriptive attributes. A classifier typically is constructed by an inducer, which is a method that builds the classifier from a training set of sample data. The training set consists of samples containing attributes, one of which is the class label. After a classifier has been built, its structure may be used to classify unlabeled instances as belonging to one or more of the potential classes.
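By way of illustration only, the relationship between an inducer, a training set, and the resulting classifier may be sketched as follows. The sketch assumes a simple nearest-centroid inducer, which is not drawn from the description above; the names `induce_nearest_centroid`, `training_set`, and `classifier` are hypothetical and are chosen solely for this example.

```python
from collections import defaultdict


def induce_nearest_centroid(training_set):
    """Hypothetical inducer: builds a classifier from (features, label) samples.

    Assumption: a nearest-centroid rule stands in for whatever inducing
    method a real system would use.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in training_set:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1

    # One centroid (mean feature vector) per class label.
    centroids = {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }

    def classify(features):
        """Map an unlabeled instance to the label of the closest centroid."""
        def squared_distance(centroid):
            return sum((f - c) ** 2 for f, c in zip(features, centroid))
        return min(centroids, key=lambda label: squared_distance(centroids[label]))

    return classify


# Training set: descriptive attributes paired with a class label.
training_set = [
    ((1.0, 1.2), "negative"),
    ((0.8, 1.0), "negative"),
    ((3.0, 3.1), "positive"),
    ((3.2, 2.9), "positive"),
]

classifier = induce_nearest_centroid(training_set)
predicted = classifier((2.9, 3.0))  # classify an unlabeled instance
```

In this sketch the inducer is the function that consumes the labeled samples, and the classifier is the function it returns, which is then applied to instances whose label is unknown.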
Many different classifiers have been proposed. In application environments in which the amount of negative data is much greater than the amount of positive data, it has been discovered that it is computationally more efficient to decompose a complex single-stage classifier into a cascade of relatively simple classification stages. Such a cascaded classifier typically is designed so that the initial classification stages efficiently reject negative instances, and the final classification stages in the cascade process only the positive instances and the negative instances that are hard to distinguish from positive instances. In cases where negative instances far outnumber positive instances, such a cascaded classifier is much more efficient than classifiers that process each instance in a single stage.
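The early-rejection behavior described above may be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function `cascade_classify`, the three scoring functions, and the threshold values are all hypothetical, and an instance is assumed to be rejected as soon as any stage score falls below that stage's threshold.

```python
def cascade_classify(instance, stages):
    """Run an instance through the stages in order, stopping at the first rejection.

    `stages` is a list of (score_fn, threshold) pairs (hypothetical layout).
    Returns (accepted, number_of_stages_evaluated).
    """
    for evaluated, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(instance) < threshold:
            return False, evaluated  # rejected early; later stages never run
    return True, len(stages)


# Hypothetical stages, ordered from cheapest to most expensive.
stages = [
    (lambda x: x[0], 0.5),
    (lambda x: x[0] + x[1], 1.2),
    (lambda x: x[0] * x[1] + x[2], 1.0),
]

# Mostly negative data: only the third instance passes every stage.
instances = [
    (0.1, 0.9, 0.9),
    (0.2, 0.1, 0.0),
    (0.9, 0.8, 0.7),
    (0.6, 0.3, 0.9),
]

results = [cascade_classify(x, stages) for x in instances]
total_stage_evaluations = sum(n for _, n in results)
```

With these assumed values, the four instances require seven stage evaluations in total, whereas applying all three stages to every instance would require twelve. The saving grows as the proportion of easily rejected negative instances increases.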
In a typical cascaded classifier design, each classification stage has a respective classification boundary that is controlled by a respective threshold. The overall classification boundary of the classifier is changed whenever one or more of the thresholds for the individual classification stages are changed. In one cascaded classifier design, the classification stages initially are trained using the well-known AdaBoost inducing method, which provides an initial set of default threshold values for the classification stages. Each classification stage then is optimized individually by adjusting the default threshold value assigned to that classification stage in a way that minimizes the incidence of false negatives (i.e., incorrect rejections of positive instances). In this design approach, however, there is no guarantee that the selected set of threshold values achieves optimal performance for the sequence of classification stages as a whole.
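The limitation noted above may be illustrated with a small sketch. It does not implement AdaBoost; the stage scores are invented for the example, the function `tune_stage_threshold` and the parameter `max_missed` are hypothetical, and each stage is assumed, for simplicity, to be tuned against the full set of positive instances with an instance accepted when its score is at or above the threshold.

```python
def tune_stage_threshold(positive_scores, max_missed):
    """Pick the highest threshold that rejects at most `max_missed` positives.

    Assumption: an instance passes the stage when score >= threshold.
    """
    return sorted(positive_scores)[max_missed]


# Hypothetical scores that two stages assign to the same five positive instances.
stage_scores = [
    [0.2, 0.7, 0.8, 0.9, 0.6],  # stage 1
    [0.9, 0.8, 0.1, 0.7, 0.6],  # stage 2
]

# Each stage is tuned on its own, allowing one false negative per stage.
thresholds = [tune_stage_threshold(scores, max_missed=1) for scores in stage_scores]

# Positives accepted by each stage considered in isolation.
per_stage_detected = [
    sum(score >= t for score in scores)
    for scores, t in zip(stage_scores, thresholds)
]

# Positives accepted by the cascade, i.e., by every stage in sequence.
num_instances = len(stage_scores[0])
cascade_detected = sum(
    all(scores[i] >= t for scores, t in zip(stage_scores, thresholds))
    for i in range(num_instances)
)
```

Under these assumptions each stage, taken alone, accepts four of the five positive instances, yet the cascade as a whole accepts only three, because the two stages reject different positive instances. Thresholds that appear acceptable stage by stage therefore need not be well chosen for the sequence of stages taken together.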