In a reputation-based security system, a security-software vendor may attempt to determine the trustworthiness of a file by collecting, aggregating, and analyzing information from potentially millions of user devices within a community, such as the vendor's user base. For example, by determining a file's origin, age, and prevalence within a community, along with other details (such as whether the file is predominantly found on at-risk or “unhealthy” machines within the community), a security-software vendor may gain a fairly accurate understanding of the file's trustworthiness.
Unfortunately, until it has collected sufficient information about a file, a reputation-based security system may be unable to accurately determine the file's trustworthiness. As a result, rather than risk producing a false negative or false positive, such a system may classify the file's trustworthiness as unknown and allow users to download or install the file at their own discretion. Upon encountering a file whose trustworthiness is unknown, some users within a community may decide to download or install the file based on personal knowledge of, or belief in, the legitimacy of the file or its source.
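The fallback to an unknown classification described above may be illustrated with a minimal sketch. The field names, thresholds, and labels below are hypothetical assumptions chosen for illustration only; they are not drawn from any particular vendor's system.

```python
from dataclasses import dataclass


@dataclass
class FileReport:
    """Aggregated community data for one file (all fields are hypothetical)."""
    prevalence: int         # number of community devices reporting the file
    age_days: int           # days since the file was first observed
    unhealthy_ratio: float  # fraction of reporting devices considered at-risk


def classify(report: FileReport,
             min_prevalence: int = 100,
             min_age_days: int = 7,
             max_unhealthy_ratio: float = 0.2) -> str:
    """Return 'trusted', 'untrusted', or 'unknown' (illustrative thresholds)."""
    # With too little evidence, decline to judge rather than risk a
    # false positive or false negative.
    if report.prevalence < min_prevalence or report.age_days < min_age_days:
        return "unknown"
    # A file found predominantly on at-risk machines is treated as untrusted.
    if report.unhealthy_ratio > max_unhealthy_ratio:
        return "untrusted"
    return "trusted"


new_file = FileReport(prevalence=5, age_days=1, unhealthy_ratio=0.0)
established_file = FileReport(prevalence=5000, age_days=90, unhealthy_ratio=0.05)
print(classify(new_file))
print(classify(established_file))
```

In this sketch, a newly observed file falls below the evidence thresholds and is therefore left unclassified, which corresponds to the case in which the download or install decision is left to the user.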
Although such user actions (e.g., downloading or installing the file) may provide additional information about the trustworthiness of the file based on users' personal knowledge, current reputation-based security systems typically fail to take advantage of this additional source of information when classifying the trustworthiness of files. As such, the instant disclosure identifies a need for systems and methods for classifying unknown files based at least in part on actions taken by users within a community.