Digital Asset Management (DAM) systems are used to collect and store images, videos, or other visual media. In some applications, a client accesses the DAM system to search, download and use the images and videos for advertisements and marketing purposes. However, a client may wish to use a particular object within an image or video. For instance, in an example involving a source image depicting a model wearing a particular dress or article of clothing, a client may wish to modify or otherwise use the image portion depicting the dress or other article of clothing worn by a model. In addition to the article of clothing, the image may also include the model wearing the article of clothing, and a background scene. Due to the number of objects (e.g., the model, the background scene) included in the source image in addition to the desired object (e.g., the dress), a conventional image search is unable to generate an image including the particular article of clothing without other objects from the source image. It is desirable for a system to allow a client to search for a desired object within visual media containing multiple objects.
Further, in some instances, because a desired object is included within a visual medium (e.g., an image or video), the object has undergone distortions that cause the object to be discounted in an image search. For example, the object includes a resolution that causes it to be unrecognizable as the desired object during an image search by a machine-learning algorithm. These distortions prevent machine-learning algorithms from being properly trained to recognize a desired object that is distorted within a visual medium.
In some applications, data scarcity (e.g., an insufficient number of training images) prevents deep learning by a machine-learning algorithm to identify desired objects from an image. For example, a dataset of visual media used to train a system lacks sufficient examples of some images of a particular category to allow the system to recognize images within the category. Thus, it is also desirable for a system to expand a smaller dataset of images to a larger dataset having a sufficient amount of images to appropriately train the system.