With the proliferation of digital photography, automatically classifying images is useful in many applications. To classify an image, the image needs to be semantically understood. Classification is typically formulated as a multi-class or multi-label learning problem.
In a multi-class image classification setting, each image is categorized into one (and only one) category of a set of predefined categories. In other words, only one label is assigned to each image in this setting. In a multi-label setting, which is generally closer to real world applications, each image is assigned one or multiple labels from a predefined label set, such as “sky,” “mountain,” and “water” for a scenery image showing those scenic items.
While multi-class image and solutions are used, both have drawbacks. In general such drawbacks include somewhat poor classification accuracy, as well as classification accuracy that varies depending on the types of images in the dataset.