1. Field of the Invention
This invention pertains in general to computer security and in particular to the identification of websites that host or disseminate malicious software (malware).
2. Description of the Related Art
There is a wide variety of malicious software (malware) that can attack modern computers. Malware threats include computer viruses, worms, Trojan horse programs, spyware, adware, crimeware, and phishing websites. Modern malware is often designed to provide financial gain to the attacker. For example, malware can surreptitiously capture important information such as logins, passwords, bank account identifiers, and credit card numbers. Similarly, the malware can provide hidden interfaces that allow the attacker to access and control the compromised computer.
One method by which malware can attack a computer is through a user's access of a website that hosts or disseminates malicious code (a “malicious website”). The identification of a set of known malicious websites enables users of client computer systems running security software to receive alerts and notifications that enable the users to assess the risk of and/or block access to the malicious websites.
Due to code obfuscation and other techniques employed by malicious websites to evade detection, it is often necessary to access a website in order to determine whether the website hosts or disseminates malicious software. Once the website has been accessed, techniques may be employed to identify commands and transactions on a computer system accessing the website that indicate malicious behavior. Although these techniques are reliable for identifying malicious websites, they require significant computational resources. Due to the massive number of websites a user can potentially visit, these techniques are not scalable for identifying a comprehensive set of known malicious websites for use in security software.
Accordingly, there is a need in the art for scalable methods of identifying malicious websites.