Existing anti-virus technologies are becoming increasingly ineffective at protecting computing resources from malicious files and programs, such as viruses and other types of malware, which has led to the investigation of alternative technologies. One promising area of development is file “whitelisting,” a system in which only the applications, files, or programs contained within a defined list may be accessed or executed by a computing system, while all other files or programs are prevented from running on that system.
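By way of illustration only, the default-deny behavior of a whitelist may be sketched as follows. The sketch assumes that each listed item is identified by the SHA-256 digest of its contents, which is one common approach and is not specified above. The function names `sha256_of` and `is_allowed` are hypothetical and chosen solely for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_allowed(path: Path, whitelist: set) -> bool:
    """Default-deny check: permit a file only if its digest is listed.

    Assumption: `whitelist` is a set of hex SHA-256 digests. Any file
    whose digest is absent is refused, whether or not it is malicious.
    """
    return sha256_of(path) in whitelist
```

In such a sketch, a caller would invoke `is_allowed` before opening or executing a file and refuse the operation whenever it returns `False`. Because an unlisted file is blocked regardless of whether it is actually malicious, the usefulness of the approach depends on how completely the list covers legitimate files.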
Conventional whitelist systems rely on either manually created whitelists or web-spidering (often referred to as web-crawling) techniques to identify legitimate (or potentially legitimate) files. However, given the rate at which new applications are created and published each day, often via the Internet, it is practically impossible to manually create a comprehensive whitelist of legitimate files.
Moreover, conventional web-spidering techniques typically identify only a portion of known legitimate files, estimated to be as low as 10%, due to various limitations in web-spidering technology. For example, web-spidering techniques have difficulty accessing and analyzing files that become accessible only after a user fills out an online form and/or purchases the file via an electronic transaction. Conventional web-spidering techniques are also prone to falsely identifying illegitimate files as legitimate, and vice versa, further limiting the viability of the resulting whitelist.