1. Technical Field
One or more embodiments of the present disclosure relate generally to machine learning. More specifically, one or more embodiments of the present disclosure relate to systems and methods for training and applying machine learning to learn features of content items such as images.
2. Background and Relevant Art
Image recognition is a central problem in the field of computer vision (e.g., the field of computers acquiring, processing, analyzing, and understanding images in order to produce numerical or symbolic information). One area of computer vision that has shown great progress in the last decade is feature learning. Feature learning plays a role in image recognition by combining computer vision and machine learning to solve visual tasks. In particular, feature learning finds a set of representative features by collecting features from images and learning the features using machine-learning techniques.
Early image recognition systems typically use handcrafted image features. These early image recognition systems focus on spatial pyramid matching techniques that recognize natural scenery, objects from local scale-invariant features, and object categories using the output of a set of predefined category-specific classifiers. As such, these early image recognition systems typically require users to manually identify features in order to train the system to learn and recognize image features. Because these early image recognition systems typically concentrate on low-level features (e.g., the appearance of images), these early systems often require a significant amount of domain knowledge. As a result, these early image recognition systems often do not generalize well to new domains.
More recent image recognition systems shift their focus toward high-level features. In other words, more recent image recognition systems concentrate more on semantics rather than on appearance. These recent image recognition systems, however, still suffer from a number of shortcomings, which has led to the development of current image recognition systems.
Current image recognition systems attempt to learn features directly from data (e.g., images). In particular, current image recognition systems often use supervised training from user-labeled data to perform image recognition. Using data-driven features, current systems appear to effectively outperform the early and recent image recognition systems in some cases. Current image recognition systems do not typically require domain knowledge. This being said, current image recognition systems often do require large-scale category labels (in the order of millions) to properly train the system. Accordingly, current image recognition systems are often limited in domains where labels are difficult to obtain.
As another problem, in domains where labels are difficult to obtain, users typically are required to manually provide domain labels before current image recognition systems can perform image recognition. In some instances, current image recognition systems try to get around the problem of training in a new domain without labels by using labels from related domains. Using labels transferred from other domains, however, typically results in poor image recognition outcomes.
These and other problems exist with regard to image recognition and feature learning in the field of computer vision.