Nearly all tasks that need to learn a classifier or regression model require some number of labeled samples. For instance, if we are to learn a concept detector for “cat”, we need to label some number of “cat” images and some non-cat images so that the concept detector can be learned from them. The learned concept detector can then be used to determine the relevance of an image to the query “cat” provided to a search engine.
However, labeling is often expensive in terms of the human labor involved in labeling large sets of samples. Thus, in many instances it is difficult to label a sufficiently large number of samples. Active learning techniques can be utilized to reduce the number of labeled samples required. In other words, given the same amount of labeling effort, active learning can lead to a better classifier than one trained with traditional passive learning. Active learning techniques include selecting sample images from a group of images for labeling by one or more human users.
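By way of illustration only, the selection step described above may be sketched as follows. The text does not specify a selection criterion, so this sketch assumes uncertainty sampling, one common choice, in which the images whose predicted probability lies closest to 0.5 are sent to human labelers. The function name `select_for_labeling`, the image identifiers, and the scores are hypothetical.

```python
from typing import Dict, List


def select_for_labeling(cat_probability: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` image ids whose predicted probability is closest to 0.5.

    Assumption: uncertainty sampling. The source text does not name a criterion.
    """
    # Distance from 0.5 measures confidence; a smaller distance means more uncertain.
    # The image id is a tie-breaker so the ordering is deterministic.
    ranked = sorted(
        cat_probability,
        key=lambda image_id: (abs(cat_probability[image_id] - 0.5), image_id),
    )
    return ranked[: max(budget, 0)]


# Hypothetical scores from a partially trained "cat" concept detector.
scores = {"img_a": 0.97, "img_b": 0.52, "img_c": 0.08, "img_d": 0.41, "img_e": 0.66}
to_label = select_for_labeling(scores, budget=2)
print(to_label)
```

In this sketch, the two images the detector is least sure about are chosen for labeling, while confidently scored images such as `img_a` are left unlabeled, which is how a fixed labeling budget may be spent on the most informative samples.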
While active learning techniques prove effective for labeling large groups of elements, these techniques fail to consider the difference in the distribution of the sampled images between training and testing. As such, the techniques may not produce an accurately trained model.