Due to the increase in the availability of digital images (e.g., Internet images, medical images, personal photographs), image classification models are often used to label a large number of images. In one example, image classification models may be used to provide a single high-level label of a histopathology image (e.g., whether it contains cancerous tissue or not). In another example, images classification models may be used to associate images on the Internet with a search term or terms (e.g., bike, bird, tree, football, etc.).
Conventional approaches to building an image classification model rely on strongly supervised learning that require detailed manual annotations of multiple different visual concepts in images (e.g., different objects in photo, cancerous regions in image, type of cancer, etc.) to ensure accurate labeling. Therefore, the conventional approaches are labor-intensive and time-consuming due to the large amount of human involvement, which may not be feasibly scalable to a large number of images that may contain multiple different visual concepts.