An asset management system may create or modify image tags that describe some aspect of the images (e.g., the subject matter, location, time period, etc.). One way of automatically applying a tag to an image is to find a semantically similar image with one or more tags (e.g., an image that depicts similar shapes, spatial relationships, colors, and other visual content) and apply the tags from the semantically similar image to the untagged image. For example, FIG. 1 depicts a sample image 102 of a person kicking a football. The sample image 102 has a tag 104 stating “football game.” An automatic tagging process 108 uses the sample image 102 to add a tag to an input image 106. For example, the tagging process 108 can determine that both the sample image 102 and the input image 106 depict a person with a football and, based on that determination, modify the input image 106 to create a tagged image 106′ having the tag 110 with the same content (i.e., “football game”) as the tag 104.
To identify semantically similar images, classifier algorithms, such as neural network algorithms, can be used to classify different images as being semantically similar to one another. To accurately classify different images, the classifier algorithm is “trained” to recognize certain semantic content (e.g., content in a set of training images depicting trees) and associate that semantic content with a class (e.g., a class labeled “trees”). Through this training process, the classifier algorithm learns which semantic features should be used to assign images to a given class. For example, even though an image of a palm tree and an image of pine tree may depict different shapes (e.g., the shape of the leaves, the color of the tree trunk, etc.), if a classifier algorithm is instructed that both images belong to the “trees” class, the classifier algorithm can learn to identify which semantic features (e.g., a tree trunk with leaves on top) should be used to assign other images to the “tree” category.
Image tags can provide the classes to be used by a classifier algorithm. In a simplified example, a set of fifty training images may include forty images with a tag “dog” and ten images with the tag “boat.” The classifier algorithm may learn that certain semantic content (e.g., four legs and a tail) included in the subset of forty images should be associated with the “dog” tag and that certain semantic content (e.g., a sail) included in the subset of ten images should be associated with the “boat” tag. If an untagged image of a boat is later received, the trained classifier algorithm can determine that the image should be associated with the “boat” tag.
Using a small set of training images can present disadvantages. If a set of training images is too small, the classifier algorithm will not learn how to properly classify certain types of semantic content. For example, images of buildings or fish may not have semantic features in common with images of boats and dogs. Thus, a classifier algorithm may not be able to accurately classify images of buildings or fish. Another disadvantage is that the small number of tags may prevent the classifier algorithm from learning how to classify images into more descriptive categories. For example, images of sailboats and aircraft carriers may both be classified as “boats” even though these images include very different types of semantic content.
The accuracy of a classifier algorithm may be improved by using a large set of training images with large numbers of tags. However, training a classifier algorithm using very large sets of tagged training image may be computationally infeasible. For example, a database of images may include millions of images, with each image having thirty or more tags, which results in millions of images with thousands of tags. A computing device that executes a classifier algorithm may have insufficient processing power, storage, or other computing resources that are required to train the classifier algorithm how to classify millions of images using classes defined by thousands of tags.
It is therefore desirable to enable a classifier algorithm that is used for image tagging to be trained using large sets of training images and their associated tags.