In the span of just a few years, spyware has become the Internet's most “popular” download. A recent scan of 329 customers' computers, performed by America Online/National Cyber Security Alliance (AOL/NCSA), found that 80% were infected with spyware programs. More shocking, each infected computer contained an average of 93 spyware components. As used herein and in the claims that follow, a definition of the term “spyware” provided by the online encyclopedia Wikipedia™ is applied. Wikipedia™ defines spyware as “a broad category of malicious software designed to intercept or take partial control of a computer's operation without the informed consent of that machine's owner or legitimate user.” Wikipedia™ further notes that “while the term [spyware] taken literally suggests software that surreptitiously monitors the user, it has come to refer more broadly to software that subverts the computer's operation for the benefit of a third party.” Adware, which displays advertising for a service or product, may be a form of spyware if it is installed without a user's consent. Most users are willing to accept the display of sponsored pop-up advertising as the price of visiting a Web page that provides a desired benefit at no other cost to the user. However, if the adware installs any software component on a user's computer without the user's knowledge or agreement, or continues displaying advertising when the user is accessing other sites, the adware is properly viewed as spyware.
Some spyware is designed simply to gather information that would generally be viewed as innocuous, such as a log of the Web pages that a user visits, for the purpose of targeting advertising to customers more effectively. Other forms of spyware are more harmful. They can deliver unsolicited pop-up advertising when the user visits unrelated Web pages that do not benefit from the sponsored advertising being displayed; surreptitiously gather personal information about a user, including the user's social security number or credit card numbers; change a user's home page; or redirect Web page requests entered by a user to a different Web site, e.g., one that solicits the user to access pornography.
The consequences of spyware infections can be severe. They can include inundating the victim with pop-up ads that open faster than the victim can close them, enabling a third party to use the victim's financial information to purchase merchandise or withdraw funds from the victim's bank account, or stealing the victim's passwords. Another form of spyware that is sometimes referred to as “malware” may even render the victim's computer useless. At the very least, the spyware installed on a computer diverts system and processor resources away from the tasks desired by a user and can dramatically slow computer response time in carrying out those tasks or in loading the desktop. In many cases, the user will not even be aware of what is causing these problems, since the spyware is installed without the user's knowledge or consent.
Spyware typically installs itself surreptitiously through one of two methods. First, a user might choose to download software to which piggy-backed spyware code has been attached. For example, a user may initiate download of a desired utility program, and the piggy-backed spyware will be included with the download and automatically installed when the utility program is installed. Piggy-backed spyware is particularly common with file-sharing software. The file-sharing Kazaa™ system alone has been the source of hundreds of millions of spyware installations. Second, a user might visit a Web page that invisibly performs a “drive-by download” attack (sometimes also referred to herein as a “drive-by installation”), exploiting a vulnerability in the user's browser to install software without the user's consent. In each case, it is unlikely that the user will have any indication that spyware has been installed. Only when the adverse effects of the spyware are experienced may the user become aware that the computer no longer works as it did before becoming infected.
In previous work related to spyware, passive network monitoring was used to measure the extent to which four specific adware programs had spread through computers on the University of Washington campus. A separate study examined the spyware problem from a different perspective, measuring the extent to which: (1) executable Web content contains spyware; and, (2) Web pages contain embedded drive-by download attacks. Both studies confirmed the existence of a significant spyware problem.
The AOL/NCSA online safety study mentioned above conducted a poll of 329 households and also examined their computers for the presence of spyware. Over half of the respondents believed their machines were spyware-free. In reality, 80% of computers scanned were infected with spyware programs. The AOL/NCSA study did not attempt to identify how these computers became infected.
A recent issue of the “Communications of the ACM” contained over a dozen articles on the spyware problem. These articles discuss issues such as the public perception of spyware, security threats caused by spyware, and frameworks for assessing and categorizing spyware programs.
Many projects have examined the detection, measurement, and prevention of malware, such as worms and viruses. Some of their techniques may ultimately be applicable to the detection and prevention of spyware. However, none of the current approaches for identifying Web pages that install spyware is able to detect such a Web page on-the-fly, in real time, as a user is about to open the Web page in a browser or download an executable file.
Although a number of different commercially available programs can be employed to scan a computer system to detect known spyware, by the time that the spyware is thus detected and removed, the user may have experienced significant problems and the efficient operation of the user's computer may have been adversely impacted. An active Internet user can unknowingly be exposed to multiple sources of spyware each day, so that even if a spyware scanning program is used each evening while the computer is not otherwise in use, the spyware installed that day may already have adversely impacted the user before it can be detected and removed.
Accordingly, in addition to identifying Web pages that carry out drive-by installation of spyware and executable files that include piggy-backed spyware based on Web crawling by a dedicated entity, it would be desirable to seamlessly detect spyware in real time and on-the-fly, before it is installed on a user's computer system. It would also be desirable to provide this detection without the interaction of the user and to preclude the user from downloading Web pages and executable files that install spyware. In some cases, it may be desirable to detect the spyware in real time using a centralized computing device to which a user's computing device is connected. Alternatively, it may instead be desirable, for example for home use, to enable the user's computing device to detect spyware threats from Web pages and/or executable files before they are accessed by the user, or to employ some combination of these approaches.
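By way of illustration only, the kind of pre-access check contemplated above might be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the method of any particular system: the names `Verdict`, `SpywareGate`, `crawler_verdicts`, and `analyze_on_the_fly` are hypothetical, and the real-time analyzer is a stand-in callable whose internal decision logic is outside the scope of the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class Verdict(Enum):
    CLEAN = "clean"
    SPYWARE = "spyware"


@dataclass
class SpywareGate:
    """Hypothetical gate consulted before a Web page or executable is fetched."""

    # Stand-in for a real-time analyzer (assumed); it is called only for
    # URLs that have no stored verdict.
    analyze_on_the_fly: Callable[[str], Verdict]
    # Verdicts produced ahead of time, e.g., by Web crawling (assumed format).
    crawler_verdicts: Dict[str, Verdict] = field(default_factory=dict)

    def check(self, url: str) -> Verdict:
        """Return the stored verdict for url, analyzing it first if unknown."""
        verdict = self.crawler_verdicts.get(url)
        if verdict is None:
            verdict = self.analyze_on_the_fly(url)
            self.crawler_verdicts[url] = verdict  # remember for later requests
        return verdict

    def allow(self, url: str) -> bool:
        """Return True only if the request may proceed; no user interaction."""
        return self.check(url) is Verdict.CLEAN
```

Under these assumptions, the same gate could run either on a centralized computing device through which requests pass or on the user's own computing device. In both cases, `allow(url)` would be called before the browser opens the Web page or begins the download, and a `False` result would preclude the access.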