Most methods for classifying objects, such as objects in images, use statistical models that are trained using labeled training data. Typically, a passive learning method accepts randomly selected training data. However, providing and labeling a large training data set is time consuming, while a small training data set reduces the accuracy of the classification.
In active learning, the method selects “useful” data for a user to label, instead of passively accepting randomly selected data. Compared to passive learning, active learning can significantly reduce the amount of training data required while achieving similar classification accuracy.
The principal idea in active learning is that not all data are of equal value to the classifier, especially for classifiers that have sparse solutions. For example, with a support vector machine (SVM) classifier, the classification is the same when all training data except the support vectors of a separating hyperplane are omitted. Thus, the support vectors define the hyperplane, and all other data are redundant.
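The redundancy of non-support vectors can be illustrated with a minimal sketch, assuming separable one-dimensional data in which every negative example lies below every positive example. Under that assumption, the maximum-margin separator is the midpoint between the two closest examples of opposite classes, and those two examples are the support vectors. The function name max_margin_threshold and the data values are hypothetical and chosen only for illustration.

```python
def max_margin_threshold(points):
    """Hard-margin separator for separable 1-D data (illustrative only).

    points: list of (x, label) pairs with label in {-1, +1}; every
    negative x is assumed to be strictly less than every positive x.
    Returns (threshold, support_vectors).
    """
    largest_neg = max(x for x, y in points if y == -1)
    smallest_pos = min(x for x, y in points if y == +1)
    if largest_neg >= smallest_pos:
        raise ValueError("data are not separable in the assumed orientation")
    # The margin is maximized at the midpoint of the two closest examples.
    threshold = (largest_neg + smallest_pos) / 2.0
    return threshold, [(largest_neg, -1), (smallest_pos, +1)]


# Hypothetical training data.
data = [(-3.0, -1), (-2.0, -1), (-0.5, -1), (1.5, +1), (2.0, +1), (4.0, +1)]

# Train on all data, then again on the support vectors alone.
threshold_all, support = max_margin_threshold(data)
threshold_sv, _ = max_margin_threshold(support)
```

In this sketch, threshold_all and threshold_sv are equal, because the four examples that are not support vectors play no part in determining the separator.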
One active learning method is for an SVM in a relevance feedback framework for image retrieval. That method relies on margins for unlabeled examples and works only for a binary classifier.
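A margin-based selection of this general kind can be sketched as follows, assuming a linear binary classifier with a decision function w·x + b. The sketch is not the cited method itself; it shows only the common idea of querying the unlabeled example that lies closest to the separating hyperplane. The function name select_most_uncertain and the values of the weights, bias, and pool are hypothetical.

```python
def select_most_uncertain(weights, bias, pool):
    """Return the index of the pool example nearest the hyperplane.

    The absolute decision value |w.x + b| is proportional to the
    distance from the hyperplane (the factor 1/||w|| is the same for
    every example), so it gives the same ranking as the distance.
    """
    def abs_score(x):
        return abs(sum(w * xi for w, xi in zip(weights, x)) + bias)

    return min(range(len(pool)), key=lambda i: abs_score(pool[i]))


# Hypothetical classifier and unlabeled pool.
weights, bias = [1.0, -1.0], 0.0
pool = [(3.0, 0.0), (1.0, 0.9), (-2.0, 2.0)]

query_index = select_most_uncertain(weights, bias, pool)
```

Because the criterion depends on a single signed decision value, it has no direct analogue when more than two classes are present, which is consistent with the binary restriction noted above.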
Another active learning method minimizes a version space at each iteration. The version space is the subset of all hypotheses that are consistent with the training data. However, that method is also not easily extensible to the multi-class case.
Another method uses Gaussian processes for active learning. That method relies on uncertainty estimates of the examples, obtained through Gaussian process regression, which is computationally expensive: O(N^3), i.e., cubic in the number of training examples N. That method handles multi-class classification using a one-versus-all SVM formulation. To accommodate multi-class problems, that method adds one example per classifier at each round, because the method is primarily targeted towards binary classification.
Another method for active learning in the multi-class setting finds examples that minimize the expected entropy of the system after the addition of random examples from the active pool, see Holub et al., “Entropy-based active learning for object recognition,” CVPR, Workshop on Online Learning for Classification, 2008. In essence, the method is an information-based approach because it attempts to gain the most information about the unknown labels in order to minimize entropy. That approach is extremely computationally intensive: the time is cubic in both the number of unlabeled images N and the number of classes k, i.e., O(k^3 N^3).
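The entropy quantity that such an approach seeks to reduce can be sketched as follows. The sketch computes only the Shannon entropy of estimated class probabilities and sums it over a pool; it does not perform the expected-entropy minimization of Holub et al., which requires re-estimating the probabilities for each candidate example and each possible label. The function names and the probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of one class-probability estimate."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def system_entropy(pool_probs):
    """Sum of per-example entropies over the unlabeled pool."""
    return sum(entropy(probs) for probs in pool_probs)


# Hypothetical class-probability estimates for three unlabeled
# examples and k = 4 classes.
pool_probs = [
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain example
    [0.50, 0.50, 0.00, 0.00],  # confused between two classes
    [1.00, 0.00, 0.00, 0.00],  # confidently classified example
]

per_example = [entropy(probs) for probs in pool_probs]
total = system_entropy(pool_probs)
```

An example whose probability estimate is spread evenly over the classes contributes the most entropy, and a confidently classified example contributes none. Evaluating the effect of each candidate on the total, rather than a single pass as here, is what makes the cited approach expensive.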
Active learning in batch mode with SVMs has been described for the task of image retrieval. That method explicitly handles the selection of multiple images at every iteration through an optimization framework. Because that method is targeted towards image retrieval, the primary classification task is binary: to determine whether an image belongs to the class of the query image.
A confidence-based active learning method uses a conditional error as a metric of uncertainty of each example. However, that method focuses only on the binary classification problem.