The following relates to the image classification arts, image indexing arts, image retrieval arts, and related arts.
Image classification typically entails the operations of training a classifier using a training set of images labeled with class identifications (i.e., class labels), and then applying the trained classifier to an input image to be classified. This approach relies upon the availability of a suitably comprehensive training set including a representative sub-set of images for each class of the classification system.
If a suitably comprehensive training set of labeled images is unavailable, then the effectiveness of the foregoing approach is poor for classes that are not well-represented in the training set. Indeed, it is impossible to train classifiers for classes that do not have a single labeled sample in the training set. In such cases, a solution is to introduce an intermediate representation between the image descriptors and the classes. Attribute-based class descriptions are an example of such an intermediate representation. They correspond to high-level image descriptors that are meaningful for, and shared across, multiple classes. By way of illustrative example, attributes for classifying images of animals could be “has paws”, “has wings”, “has four legs”, “has snout”, “is underwater”, and so forth. The standard approach to performing image classification with attribute descriptions is a two-step process known as Direct Attribute Prediction (DAP). DAP first employs attribute-level classifiers to compute image attribute probabilities for the image (one classifier per image attribute), and then applies a Bayesian classifier that computes class probabilities based on the attribute probabilities output by the attribute classifiers.
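The two-step DAP process described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the per-attribute classifier outputs are stubbed with fixed probabilities, the attributes and class signatures are the illustrative animal attributes from the example, and the Bayesian step assumes independent attributes and a uniform class prior (common simplifications).

```python
import numpy as np

# Binary attribute signatures for each class (1 = class exhibits the attribute).
# Attribute order: has_paws, has_wings, has_four_legs, has_snout, is_underwater
CLASS_SIGNATURES = {
    "dog":  np.array([1, 0, 1, 1, 0]),
    "duck": np.array([0, 1, 0, 0, 0]),
    "fish": np.array([0, 0, 0, 0, 1]),
}

def dap_class_probabilities(attr_probs):
    """Step 2 of DAP: combine per-attribute probabilities p(a_m = 1 | image)
    into class posteriors, assuming independent attributes and a uniform
    class prior.  p(class | image) is proportional to the product over
    attributes m of p(a_m = signature_m | image)."""
    scores = {}
    for cls, sig in CLASS_SIGNATURES.items():
        per_attr = np.where(sig == 1, attr_probs, 1.0 - attr_probs)
        scores[cls] = per_attr.prod()
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

# Step 1 of DAP would run one trained attribute classifier per attribute
# on the input image; here we use illustrative outputs for a dog-like image.
attr_probs = np.array([0.9, 0.1, 0.8, 0.7, 0.05])
posteriors = dap_class_probabilities(attr_probs)
print(max(posteriors, key=posteriors.get))  # prints "dog"
```

In a real system the stubbed `attr_probs` would come from per-attribute classifiers (e.g., trained on image descriptors), and the class signatures could themselves be probabilities rather than binary values.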
The use of DAP enables classification of images into classes for which there are no examples in the training set. Such “zero-shot” learning relies upon characterizing the class by the image attributes that images belonging to the class exhibit (or lack).
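The zero-shot case can be illustrated by extending the same scoring to a class with no training images at all. In this sketch, the class name “otter” and its attribute signature are hypothetical: the class is described only by which attributes it exhibits, and it can be scored with attribute classifiers that were trained on images of other classes.

```python
import numpy as np

# Attribute order: has_paws, has_wings, has_four_legs, has_snout, is_underwater
# Hand-specified signature for a class with zero labeled training images.
OTTER_SIGNATURE = np.array([1, 0, 1, 1, 1])  # hypothetical "otter" description
DUCK_SIGNATURE  = np.array([0, 1, 0, 0, 0])  # a seen class, for comparison

def signature_likelihood(attr_probs, signature):
    """Likelihood of the image under a class's attribute description:
    product over attributes m of p(a_m = signature_m | image)."""
    return np.where(signature == 1, attr_probs, 1.0 - attr_probs).prod()

# Illustrative attribute-classifier outputs for an otter-like image; the
# attribute classifiers themselves never saw a labeled otter during training.
attr_probs = np.array([0.8, 0.05, 0.85, 0.8, 0.7])
otter_score = signature_likelihood(attr_probs, OTTER_SIGNATURE)
duck_score = signature_likelihood(attr_probs, DUCK_SIGNATURE)
print(otter_score > duck_score)  # prints True: the unseen class wins
```

The point of the example is that adding a new class requires only its attribute signature, not any retraining on labeled images of that class.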