Neural networks are used for performing complex tasks, for example, classification tasks such as recognizing patterns or objects in images, natural language processing, computer vision, speech recognition, bioinformatics, and so on. The quality of the results produced by a neural network depends on the quality of the training of the neural network. Training such neural networks requires labelling large amounts of training data to identify different classes of data, e.g., positive and negative examples. However, many training datasets misrepresent or underrepresent the data that the neural networks are intended to process.
For example, training datasets for a neural network for recognizing objects in an image may include images containing gorillas and images containing dolphins. A neural network trained on such datasets may learn that an object in an image of a jungle and/or green-centric scenery is a gorilla, and that an object in an image of an ocean and/or blue environment is a dolphin. As a result, the neural network may misclassify a photo of a human being in a jungle scenery or a photo of a swimming person. Such misclassification is especially likely in classification tasks that seek to distinguish among members of a similar group (e.g., recognizing two different types of trees, as opposed to distinguishing a tree from a dog). Likewise, some training datasets may include data obtained in a particular setting that may not match the setting of the data processed by the trained neural network. For example, a training image may be captured in one manner (e.g., with a specific camera, with specific camera settings, indoors, etc.), and an input image that is processed may not conform to those same particularities.
Additionally, generating labelled samples for large training datasets can be an expensive process that requires manual processing. Therefore, conventional techniques for training neural networks are often inadequate and can result in misclassification of input data.
The features and advantages described in the specification are not all-inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the disclosed subject matter.