Traditional machine-learning algorithms typically incorporate human knowledge (at least implicitly) as an input. For instance, a machine-learning system trained to decide whether a face is present in an image will be given a set of human-generated “ground truth” labels that signal whether a human judged the image to contain a face or not. In this case, the job of the machine-learning system is to emit analogous labels (for example, “face present” and “face not present”) in response to new, previously unseen images.
While great progress has been made in the field of machine learning, the performance of machine-learning systems often falls far short of human levels of performance, particularly in the domain of machine vision. A key limiting factor is the unavailability of labeled data; it is difficult to provide an algorithm with enough labeled training data to achieve optimal performance. Without sufficiently large datasets, machine-learning algorithms tend to “overfit” the data, adapting to spurious structure present in the training set that is not representative of the larger distribution of all examples in the real world. Machine-learning systems typically combat the effects of overfitting by a process called “regularization,” in which penalties are placed on solutions that are thought to be more likely to be the result of overfitting, typically because they are more complex or because they exhibit less stable behavior under injected noise.
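By way of illustration only, the following minimal sketch shows one conventional form of regularization, in which a penalty on the squared magnitude of the weights (commonly called ridge or L2 regularization) is added to a least-squares fit. The function name `fit_ridge`, the synthetic data, and the penalty value of 5.0 are assumptions chosen for the example and are not taken from the description above.

```python
import numpy as np

def fit_ridge(X, y, penalty):
    """Minimize ||X w - y||^2 + penalty * ||w||^2 using the closed-form solution."""
    n_features = X.shape[1]
    A = X.T @ X + penalty * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

def squared_error(X, y, w):
    """Sum of squared residuals of the linear model w on (X, y)."""
    return float(np.sum((X @ w - y) ** 2))

# Hypothetical small, noisy training set: few examples relative to the
# number of features, which is the regime where overfitting is likely.
rng = np.random.default_rng(0)
n_train, n_features = 20, 15
true_w = np.zeros(n_features)
true_w[:3] = [1.0, -2.0, 0.5]          # only three features actually matter
X_train = rng.normal(size=(n_train, n_features))
y_train = X_train @ true_w + rng.normal(scale=1.0, size=n_train)

# Unregularized least squares: free to fit the noise in the training set.
w_plain = np.linalg.lstsq(X_train, y_train, rcond=None)[0]

# Regularized fit: large weights are penalized, favoring a simpler solution.
w_ridge = fit_ridge(X_train, y_train, penalty=5.0)

print("weight norm, unregularized:", np.linalg.norm(w_plain))
print("weight norm, regularized:  ", np.linalg.norm(w_ridge))
print("training error, unregularized:", squared_error(X_train, y_train, w_plain))
print("training error, regularized:  ", squared_error(X_train, y_train, w_ridge))
```

In this sketch the regularized solution accepts a somewhat higher error on the training set in exchange for smaller weights. Whether that trade improves performance on unseen examples depends on the data and on the strength of the penalty.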
Thus, there is a need for machine-learning systems and techniques that not only incorporate larger sets of human-labeled data, but also utilize such data to regularize solutions to machine-learning problems in novel ways to better mimic human performance.