1. Field of the Invention
The present invention relates generally to network security and more particularly to detecting malicious network content.
2. Related Art
Presently, malicious network content (e.g., malicious software or malware) can attack various devices via a communication network. For example, malware may include any program or file that is harmful to a computer user, such as bots, computer viruses, worms, Trojan horses, adware, spyware, or any programming that gathers information about a computer user or otherwise operates without permission.
Adware is a program configured to direct advertisements to a computer or a particular user. In one example, adware identifies the computer and/or the user to various websites visited by a browser on the computer. The website may then use the adware to either generate pop-up advertisements or otherwise direct specific advertisements to the user's browser. Spyware is a program configured to collect information regarding the user, the computer, and/or a user's network habits. In an example, spyware may collect information regarding the names and types of websites that the user browses and then transmit the information to another computer. Adware and spyware are often added to the user's computer after the user browses to a website that hosts the adware and/or spyware. The user is often unaware that these programs have been added and is similarly unaware of the function of the adware and/or spyware.
Various processes and devices have been employed to prevent the problems that malicious network content can cause. For example, computers often include antivirus scanning software that scans a particular client device for viruses. Computers may also include spyware and/or adware scanning software. The scanning may be performed manually or based on a schedule specified by a user associated with the particular computer, a system administrator, and so forth. Unfortunately, by the time a virus or spyware is detected by the scanning software, some damage on the particular computer or loss of privacy may have already occurred.
In some instances, malicious network content comprises a bot. A bot is a software robot configured to remotely control all or a portion of a digital device (e.g., a computer) without authorization by the digital device's legitimate owner. Bot-related activities include bot propagation and attacking other computers on a network. Bots commonly propagate by scanning nodes (e.g., computers or other digital devices) available on a network to search for a vulnerable target. When the scan finds a vulnerable computer, the bot may install a copy of itself on that computer. Once installed, the new bot may continue to seek other computers on a network to infect. A bot may also be propagated by a malicious web site configured to exploit vulnerable computers that visit its web pages.
A bot may also, without the authority of the infected computer user, establish a command and control communication channel to receive instructions. Bots may receive command and control communication from a centralized bot server or another infected computer (e.g., via a peer-to-peer (P2P) network established by a bot on the infected computer). When a plurality of bots (i.e., a botnet) act together, the infected computers (i.e., zombies) can perform organized attacks against one or more computers on a network, or engage in criminal enterprises. In one example, bot-infected computers may be directed to flood another computer on a network with excessive traffic in a denial-of-service attack. In another example, upon receiving instructions, one or more bots may direct the infected computer to transmit spam across a network. In a third example, bots may host illegal businesses such as pharmaceutical websites that sell pharmaceuticals without a prescription.
Malicious network content may be distributed over a network via web sites, e.g., servers operating on a network according to an HTTP standard. Malicious network content distributed in this manner may be actively downloaded and installed on a user's computer, without the approval or knowledge of the user, simply by accessing the web site hosting the malicious network content. The web site hosting the malicious network content may be referred to as a malicious web site. The malicious network content may be embedded within data associated with web pages hosted by the malicious web site. For example, a web page may include JavaScript code, and malicious network content may be embedded within the JavaScript code. In this example, the malicious network content embedded within the JavaScript code may be obfuscated such that it is not apparent until the JavaScript code is executed that the JavaScript code contains malicious network content. Therefore, the malicious network content may attack or infect a user's computer before detection by antivirus software, firewalls, intrusion detection systems, or the like.
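The detection gap described above can be illustrated with a minimal sketch. It is not drawn from any particular antivirus product or attack. The signature string, the function names, and the use of Base64 as a stand-in for script obfuscation are all assumptions chosen for illustration. The sketch shows only that a literal signature search over static content can miss content that becomes recognizable only after it is decoded, which is analogous to what occurs when obfuscated script code is executed.

```python
import base64

# Hypothetical signature that a naive static scanner might search for.
SIGNATURE = "download_and_run"


def static_scan(text: str) -> bool:
    """Return True if the signature appears literally in the text."""
    return SIGNATURE in text


def make_obfuscated(payload: str) -> str:
    """Hide the payload with Base64 (an assumed stand-in for obfuscation)."""
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def deobfuscate(blob: str) -> str:
    """Recover the payload, standing in for what happens at execution time."""
    return base64.b64decode(blob).decode("utf-8")


# Harmless placeholder text; nothing here is fetched or executed.
plain = "download_and_run('http://example.invalid/x')"
hidden = make_obfuscated(plain)

if __name__ == "__main__":
    print("static scan of plain content:   ", static_scan(plain))
    print("static scan of obfuscated form: ", static_scan(hidden))
    print("scan after decoding:            ", static_scan(deobfuscate(hidden)))
```

In this sketch the scan of the encoded form cannot match, because the Base64 alphabet contains no underscore character, while the same scan applied after decoding does match. Obfuscation encountered in practice is typically more elaborate, but the limitation of purely static inspection is the same.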
Beginning in or about 2009, it became a widespread practice for the authors of bots to use malicious documents in the Portable Document Format (PDF) of Adobe Systems Inc. to propagate web-borne attacks. Malicious PDF documents were hosted on web servers controlled by criminals, and links to them were then created from many other websites. Innocent users could therefore unknowingly browse a website that would cause a malicious PDF to be loaded into their browser, and from there into a PDF reader, which the malicious PDF would then exploit in order to gain control of the user's computer account or of the entire computer. From there, malicious bot software would be installed.