Generic file-classification definitions are often used to classify files based at least in part on the files' features. For example, a security software product may apply a generic file-classification definition to a file encountered by an end user's computing device. In this example, the security software product may compare various features of the file (such as the file's name, size, storage location, source, extension, format, and/or creation date) with the generic file-classification definition. By comparing such features with the generic file-classification definition, the security software product may be able to fairly accurately classify the file as either clean or malicious.
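By way of illustration only, the comparison described above may be sketched as follows. This sketch assumes that a generic file-classification definition can be modeled as a set of weighted rules over file features together with a score threshold; the names `FileFeatures`, `Rule`, `Definition`, and `classify`, as well as the specific rules, weights, and threshold, are hypothetical and are not taken from any particular security software product.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class FileFeatures:
    """Hypothetical subset of the features named in the text."""
    name: str
    size: int        # size in bytes
    location: str    # storage location (directory path)
    extension: str


@dataclass(frozen=True)
class Rule:
    """One weighted test applied to a file's features (assumed model)."""
    description: str
    matches: Callable[[FileFeatures], bool]
    weight: float


@dataclass(frozen=True)
class Definition:
    """Assumed shape of a generic definition: rules plus a threshold."""
    rules: Tuple[Rule, ...]
    threshold: float


def classify(features: FileFeatures, definition: Definition) -> str:
    """Sum the weights of all matching rules and compare to the threshold."""
    score = sum(r.weight for r in definition.rules if r.matches(features))
    return "malicious" if score >= definition.threshold else "clean"


# Illustrative rules only; real definitions would be derived from training data.
definition = Definition(
    rules=(
        Rule("executable-like extension",
             lambda f: f.extension in {".exe", ".scr"}, 0.5),
        Rule("stored in a temporary directory",
             lambda f: "temp" in f.location.lower(), 0.4),
        Rule("very small file",
             lambda f: f.size < 50_000, 0.2),
    ),
    threshold=0.8,
)

suspicious = FileFeatures("invoice.scr", 20_000,
                          "C:\\Users\\a\\AppData\\Local\\Temp", ".scr")
benign = FileFeatures("report.docx", 120_000,
                      "C:\\Users\\a\\Documents", ".docx")

print(classify(suspicious, definition))
print(classify(benign, definition))
```

In this sketch, the first file matches all three rules and exceeds the threshold, while the second file matches none of the rules and is therefore classified as clean.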
Unfortunately, such generic file-classification definitions may still lead to false positives and/or false negatives in certain scenarios. For example, a security software vendor may generate the generic file-classification definition from a set of training data that includes known clean and/or malicious files. However, after generating the generic file-classification definition and releasing it to the security software product, the security software vendor may identify new clean and/or malicious files. Since the set of training data did not include these newly identified files, the generic file-classification definition may fail to account for certain information derived from these newly identified files. As a result, the generic file-classification definition may cause the security software product to produce a false negative and/or false positive upon encountering one of these newly identified files on the end user's computing device.
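By way of illustration only, the following toy sketch shows how a definition generated from a fixed set of training data may produce a false negative for a file identified after release. The derivation used here (flagging file extensions that appear only among malicious training samples) is a deliberately simplified assumption and is not the vendor's actual method; `build_definition`, `classify`, and the sample data are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Each training sample is (file extension, is_malicious). Assumed toy format.
Sample = Tuple[str, bool]


def build_definition(training: Iterable[Sample]) -> Set[str]:
    """Flag extensions seen only among malicious training samples."""
    samples = list(training)
    malicious = {ext for ext, bad in samples if bad}
    clean = {ext for ext, bad in samples if not bad}
    return malicious - clean


def classify(extension: str, definition: Set[str]) -> str:
    """Classify a file by checking its extension against the definition."""
    return "malicious" if extension in definition else "clean"


# Training data available when the definition was generated and released.
training_data = [
    (".scr", True),
    (".exe", True),
    (".exe", False),
    (".docx", False),
]
definition = build_definition(training_data)

# A malicious file identified only after the definition was released.
newly_identified = (".hta", True)
verdict = classify(newly_identified[0], definition)

# The released definition has no information about this file, so it is missed.
false_negative = newly_identified[1] and verdict == "clean"
print(verdict, false_negative)
```

Because the newly identified file was absent from the training data, the released definition carries no information about it, and the file is classified as clean even though it is known to be malicious.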
The instant disclosure, therefore, identifies and addresses a need for improved systems and methods for updating generic file-classification definitions to account for newly identified clean and/or malicious files.