Many modern network security applications monitor the network devices on a network to determine whether any network device has been infected with a malicious application, such as a virus or other malware. A security application is typically trained to make this determination by comparing the network device to a training dataset that includes a set of infected network devices and a set of clean network devices.
Unfortunately, building an accurate training dataset can be difficult in modern network environments. To avoid falsely declaring that a network device is clean, training datasets often err on the side of declaring a network device to be infected, which introduces false positives (clean network devices labeled as infected) into the training datasets. False positives in a training dataset render the training dataset inaccurate, resulting in poorly trained security applications that perpetuate the inaccurate identification of infected network devices.
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one example technology area where some embodiments described herein may be practiced.