1. Field of the Invention
The present invention relates to denial of service attack detection, identification, and mitigation. More particularly, the invention relates to a system, method, and product for extending a system's protection of its hardware, software, or data from unauthorized access, and for protecting the system from serving as a platform for maliciously caused denial of service attacks, destruction of data and software, and unauthorized modification.
2. Description of Related Art
a. Overview
Grid computing designers must solve several challenges before widespread commercial deployment can occur. One such challenge is the economic viability of a particular grid computing implementation. To a large extent, the economic viability of a particular grid computing implementation is determined by three factors: reliability, security, and weaponry. Reliability, as used herein, means computational latency guarantees. Security, as used herein, means the prevention of compromise of the data in the data stores on the grid. Weaponry, as used herein, means the resistance of the grid to being used as an identifiable entity in a Distributed Denial of Service (“DDoS”) attack, and more particularly resistance to being maliciously taken over and converted into a platform to launch DDoS attacks on other computer assets.
b. Grid Computing
The concept of a grid generally refers to a form of distributed computing in which various technological components, such as PCs and storage devices, are linked across dispersed organizations and locations to solve a single large computational problem.
A typical grid 11 is shown in FIG. 1. The grid 11 includes, solely by way of illustration and not limitation, five elements 111, 113, 115, 117, and 119, which are shown generally as work stations. However, the individual elements may themselves be subgrids, LANs, WANs, or processors. FIG. 2 illustrates a grid 11, with elements 111, 113, 115, 117, and 119, and a client workstation 221 accessing the grid 11 through an internet 223.
In this context, grid computing is the application of the resources of many computers in one or more networks to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. A well-known example of grid computing in the public domain is the ongoing SETI@home (Search for Extraterrestrial Intelligence at Home) project, in which thousands of people share the unused processor cycles of their PCs in the vast search for signs of “rational” signals from outer space.
Grid computing requires the use of software that can divide and farm out pieces of a program to as many as several thousand computers. Grid computing can be thought of as distributed, large-scale cluster computing and as a form of network-distributed parallel processing. Grid computing can be confined to a single network of computer workstations within a corporation or it can be a collaboration of a plurality of networks, for example, a public collaboration (in which case it is also sometimes known as a form of peer-to-peer computing).
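By way of illustration only, the dividing and farming out of pieces of a program can be sketched as follows. This is a minimal sketch, not the software of any particular grid: the names split_work and run_on_grid are hypothetical, and local worker threads stand in for the grid elements.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, n_pieces):
    """Divide a list into n_pieces contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_on_grid(items, piece_fn, combine_fn, n_workers=4):
    """Farm each chunk out to a worker, then combine the partial results.

    Hypothetical stand-in: threads play the role of grid elements here.
    """
    chunks = split_work(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(piece_fn, chunks))
    return combine_fn(partials)
```

For example, run_on_grid(list(range(1, 101)), sum, sum) sums each chunk on a separate worker and then sums the four partial results.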
Grid computing advantages include: (1) the ability to make more cost-effective use of a given amount of computer resources, (2) a way to solve problems that otherwise could not be approached without an enormous amount of computing power, and (3) the concept that the resources of many computers can be cooperatively and perhaps synergistically harnessed and managed as a collaboration toward a common objective. In some grid computing systems, the computers may collaborate rather than being directed by one managing computer.
Types of Grids
Grids can be data grids or computing grids.
A data grid is a grid used for sharing information. At a high level data grid information sharing is like accessing information over the Internet but with deeper content than one would traditionally get, and with more requirements for “heavier lifting” or effort and intensity in terms of computational resources.
A computing grid, on the other hand, is for the heavy crunching of numbers, and for telescoping the time necessary to arrive at the answer. The smallpox and Anthrax grids that IBM supports with grid.org are examples of that.
Security and Privacy
Security and privacy issues have to be completely thought out by the grid masters, particularly if the grid will be a multi-company or multi-entity project. Entities that do not institute security measures run the risk that anybody who owns a machine on the grid will be able to ‘eavesdrop’ on grid computations running on that unit, and even distribute zombie software for a subsequent denial of service attack.
c. Denial of Service Attacks
On the Internet, a denial of service (DoS) attack is an incident in which a user or organization is deprived of the services of a resource they would normally expect to have. Typically, the loss of service is the inability of a particular network service, such as e-mail, order entry, transaction processing, or database management, to be available or the temporary loss of all network connectivity and services. In the worst cases, for example, a Web site accessed by millions of people, such as on line banking, credit card processing, airline and other travel reservation processing, e-commerce, and on-line auction services, can occasionally be forced to temporarily cease operation. A denial of service attack can also destroy programming and files in a computer system. Although usually intentional and malicious, a denial of service attack can sometimes happen accidentally. A denial of service attack is a type of security breach to a computer system that does not usually result in the theft of information or other security loss. However, these attacks can cost the target person or entity a great deal of time and money.
FIG. 3 illustrates a grid 11 (with elements 111, 113, 115, 117, and 119, shown generally as work stations) and a client workstation 221 accessing the grid 11 through an internet 223 to initiate a DDoS attack 341, by planting harmful code in grid elements 115 and 117 (taken over as zombies) to stage attacks 351A and 351B on targets 331 and 333 which are external to the grid 11.
Common forms of denial of service attacks are:
Buffer Overflow Attacks
The most common kind of DoS attack is simply to send more traffic to a network address than the programmers who planned its data buffers anticipated someone might send. The attacker may be aware that the target system has a weakness that can be exploited, or the attacker may simply try the attack in case it might work. A few of the better-known attacks based on the buffer characteristics of a program or system include:
    Sending e-mail messages that have attachments with 256-character file names to Netscape and Microsoft mail programs
    Sending oversized Internet Control Message Protocol (ICMP) packets (this is also known as the Packet Internet or Inter-Network Groper (ping) of death)
    Sending to a user of the Pine e-mail program a message with a “From” address larger than 256 characters
SYN Attack
When a session is initiated between the Transmission Control Protocol (TCP) client and server in a network, a very small buffer space exists to handle the usually rapid “hand-shaking” exchange of messages that sets up the session. The session-establishing packets include a SYN field that identifies the sequence in the message exchange. An attacker can send a number of connection requests very rapidly and then fail to respond to the reply. This leaves the first packet in the buffer so that other, legitimate connection requests cannot be accommodated. Although the packet in the buffer is dropped after a certain period of time without a reply, the effect of many of these bogus connection requests is to make it difficult for legitimate requests for a session to get established. In general, this problem depends on the operating system providing correct settings or allowing the network administrator to tune the size of the buffer and the timeout period.
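The effect of the buffer size and the timeout period described above can be illustrated with a small model. This is a simplified sketch, not an actual TCP implementation: the class SynBacklog and its parameters capacity and timeout are hypothetical names chosen for illustration.

```python
class SynBacklog:
    """Toy model of the small buffer that holds half-open connection requests."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity  # number of half-open sessions the buffer can hold
        self.timeout = timeout    # seconds before an unanswered request is dropped
        self.pending = {}         # source -> time the connection request arrived

    def expire(self, now):
        """Drop requests that have waited for a reply longer than the timeout."""
        self.pending = {
            src: t for src, t in self.pending.items() if now - t < self.timeout
        }

    def on_syn(self, source, now):
        """Return True if the connection request could be buffered."""
        self.expire(now)
        if len(self.pending) >= self.capacity:
            return False  # buffer full: the request cannot be accommodated
        self.pending[source] = now
        return True

    def on_ack(self, source):
        """Complete the handshake; return True if a pending request was found."""
        return self.pending.pop(source, None) is not None
```

In this model, three bogus requests fill a buffer of capacity 3, so a legitimate request arriving shortly afterward is refused; the same request succeeds only once the bogus entries have timed out.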
Teardrop Attack
This type of denial of service attack exploits the way that the Internet Protocol (IP) requires that a packet too large for the next router to handle be divided into fragments. Each fragment packet identifies an offset to the beginning of the first packet that enables the entire packet to be reassembled by the receiving system. In the teardrop attack, the attacker puts a confusing offset value in the second or later fragment. If the receiving operating system does not have a plan for this situation, it can cause the system to crash.
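A receiving system that does have a plan for this situation might validate the offsets before reassembly. The following is a minimal sketch under stated assumptions: fragments are given as hypothetical (offset, length) pairs measured in bytes (real IP headers carry the offset in 8-byte units), and the complete set of fragments has already arrived.

```python
def fragments_are_consistent(fragments):
    """Return True if the (offset, length) fragments tile the packet exactly.

    Assumes all fragments of the packet are present. Any overlap or gap,
    such as the confusing offset used in a teardrop attack, returns False.
    """
    expected = 0
    for offset, length in sorted(fragments):
        if length <= 0 or offset != expected:
            return False  # fragment overlaps the previous one or leaves a gap
        expected = offset + length
    return True
```

Under these assumptions, the fragments (0, 1480), (1480, 1480), (2960, 500) are accepted in any arrival order, while a second fragment at offset 24 following a 36-byte first fragment is rejected.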
Smurf Attack
In a smurf attack, the perpetrator sends an IP ping (or “echo my message back to me”) request to a receiving site. The ping packet specifies that it be broadcast to a number of hosts within the receiving site's local network. The packet also indicates that the request is from another site, the target site that is to receive the denial of service. (Sending a packet with someone else's return address in it is called spoofing the return address.) The result will be lots of ping replies flooding back to the innocent, spoofed host. If the flood is great enough, the spoofed host will no longer be able to receive or distinguish real traffic.
Viruses
Computer viruses, which replicate across a network in various ways, can be viewed as denial-of-service attacks where the victim is not usually specifically targeted but simply a host unlucky enough to get the virus. Depending on the particular virus, the denial of service can range from hardly noticeable all the way to disastrous.
Zombie Attacks
In at least one form of denial of service attack, one or more insecure assets, such as PCs, workstations, or Web servers, are compromised by malicious attackers. The attackers place code in each such intermediate target which, when triggered, will launch an overwhelming number of attacks, such as service requests, toward an attacked ultimate target, typically a target Web site. The ultimate target will soon be unable to service legitimate requests from its users. A compromised intermediate target that is used as an attack launch point to launch DDoS attacks upon an ultimate target is known as a zombie.
While the usual zombie attack consists of a steady (and therefore more easily traced) stream of attack traffic intended to overwhelm one or more target computers, a pulsing zombie attack consists of irregular bursts of traffic intended to hamper service. It is more difficult to locate the source of an attack from a pulsing zombie, or even to know that an attack has taken place. Pulsing zombie attacks have been known to go on for months before they are detected; in one case, a victim received six times its normal traffic volume for several months.
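The distinction between a steady stream and irregular bursts can be illustrated with a simple heuristic over per-interval traffic counts. This is a sketch only: the function name classify_attack_pattern and the thresholds factor and steady_fraction are assumptions chosen for illustration, not values taken from any measured attack.

```python
def classify_attack_pattern(counts, baseline, factor=3.0, steady_fraction=0.8):
    """Label per-interval traffic counts as 'normal', 'steady', or 'pulsing'.

    An interval is treated as a burst when its count exceeds factor * baseline.
    Both thresholds are illustrative assumptions.
    """
    if not counts:
        return "normal"
    bursts = sum(1 for c in counts if c > factor * baseline)
    fraction = bursts / len(counts)
    if fraction == 0:
        return "normal"
    # Bursts in nearly every interval look like a steady stream;
    # bursts in only some intervals look like a pulsing attack.
    return "steady" if fraction >= steady_fraction else "pulsing"
```

Because a pulsing attack raises the count in only a few intervals, an average taken over a long window may stay close to the baseline, which is one reason such attacks are harder to notice.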
d. Denial of Service Attacks In a Grid Computing Environment
Resistance of the grid and the grid elements to being identifiable elements, that is, intermediate targets or potential zombies, in a DDoS attack is an overarching issue limiting commercial deployment of grid installations. To date, DDoS attacks have been very costly to a small sub-set of targets. However, the spread of zombies has heretofore been considered to be in multiple and unrelated portions of IP address space. For this reason, any bad practices, such as failure to do due diligence, by the subnet administrators responsible for the zombies have not been readily identifiable.
As described above, there are many kinds of DDoS attacks. A simple DDoS attack might be a flood of TCP SYN packets, a flood of UDP packets to a well-known port such as port 53 (DNS) or 161 (SNMP), or a flood of ICMP PING packets. In particular, TCP SYN floods have been an unfortunate part of Internet business risks. This has led to countermeasures such as TCP splicing and huge connection tables in firewall accelerators.
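The simple flood types listed above can be distinguished from a few packet header fields. The following sketch assumes each packet is represented as a dictionary with the hypothetical keys proto, flags, dst_port, and icmp_type; only the port numbers 53 and 161 and the ICMP echo request type 8 are fixed by the protocols themselves.

```python
from collections import Counter


def flood_kind(packet):
    """Map one packet (a dict with hypothetical keys) to a simple flood category."""
    proto = packet.get("proto")
    if proto == "tcp" and packet.get("flags") == "SYN":
        return "tcp-syn"
    if proto == "udp" and packet.get("dst_port") in (53, 161):  # DNS, SNMP
        return "udp-well-known-port"
    if proto == "icmp" and packet.get("icmp_type") == 8:  # echo request (ping)
        return "icmp-echo"
    return "other"


def summarize_floods(packets):
    """Count how many packets fall into each flood category."""
    return Counter(flood_kind(p) for p in packets)
```

A capture in which one category dominates the counts would be consistent with, though not proof of, the corresponding kind of flood.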
In contrast to brute force floods, a more complex DDoS attack might establish a TCP session with the victim and then overwhelm the victim with Port 443 (SSL HTTPS) secure session initiators that never complete or that are purposefully malformed. The important advantage of SSL floods to an attacker or perpetrator would be that far fewer sources would be needed. Even a large SSL server may be able to handle only a few thousand SSL initiations per second. This is in contrast to a firewall accelerator that holds a million sessions in a connection table.
The common theme in all DDoS attacks is to recruit zombies that act upon a signal (including a time of day signal generated by the operating system) to send to an ultimate victim so much traffic of a particular kind that computational resources of the ultimate victim are overwhelmed.
For this reason, weaponry, that is, the resistance of the grid to being used as an identifiable entity (that is, an intermediate target or zombie) in a Distributed Denial of Service (“DDoS”) attack, becomes a design, implementation, and deployment issue.
An attacker might remotely discover the vulnerability of a grid or its elements, for example by finding many machines with similar IP addresses (or the same IP address in the presence of a NAT) with many ports open (available and responding). Generally, these would not be well-known port numbers. For example, most port numbers from 9000 to 32000 would not be well-known port numbers.
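A grid administrator could apply the same reasoning defensively to audit how identifiable a grid is from outside. The following is a minimal sketch assuming scan results are already available as a mapping from address to open ports; the function name likely_grid_subnets, the /24 grouping, and the thresholds min_hosts and min_ports are illustrative assumptions.

```python
import ipaddress
from collections import defaultdict


def likely_grid_subnets(scan, min_hosts=3, min_ports=5, port_range=(9000, 32000)):
    """Group hosts with many open ports in port_range by their /24 subnet.

    scan maps an IPv4 address string to a list of open port numbers.
    Returns only subnets holding at least min_hosts such hosts.
    """
    lo, hi = port_range
    by_subnet = defaultdict(list)
    for addr, ports in scan.items():
        high_ports = [p for p in ports if lo <= p <= hi]
        if len(high_ports) >= min_ports:
            # strict=False masks the host bits, giving the enclosing /24 network.
            subnet = ipaddress.ip_network(f"{addr}/24", strict=False)
            by_subnet[subnet].append(addr)
    return {
        str(net): sorted(hosts)
        for net, hosts in by_subnet.items()
        if len(hosts) >= min_hosts
    }
```

Subnets returned by this audit are those whose machines would look alike to a remote observer, and therefore those most worth reviewing for unnecessary open ports.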
To an attacker, it might be obvious that machines similar in address would also be similar in operating system, applications, service pack levels, and patch levels, and would therefore have the same vulnerabilities. That is, a massive grid of many nodes may be vulnerable in the same sense as an entire network of Windows 2000 machines. An attacker could quickly take control of many machines with the same unpatched vulnerability, for example with a worm infection or an auto rooter. The result is that it would be easy to compromise several machines in one grid. In the case of a DDoS attack, the attack may make the attacked machines into zombies.
Once elements in the grid have been taken over by a DDoS attack, it is relatively easy to prove that much or most of the subsequent outgoing attacks came from a specific grid.
Thus, there is a need to detect outbound attack traffic from an infected grid to facilitate identification, reaction, and remediation, and limit the participation of the grid in a subsequent DDoS attack.
Moreover, a need exists to include recognition of grid participation in a DDoS attack by statistical measures that are indicative of a DDoS attack, enabling effective and automatic response to a DDoS attack.
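One possible form of such a statistical measure is sketched below, solely by way of illustration. It combines two assumed indicators for outbound traffic: how far the current interval's packet count deviates from recent history (a z-score), and how concentrated the traffic is on a single destination. The function name outbound_alarm and both thresholds are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def outbound_alarm(history, current_dests, z_threshold=3.0, share_threshold=0.5):
    """Flag an interval whose outbound traffic is both unusually heavy and
    concentrated on one destination.

    history: non-empty list of past per-interval outbound packet counts.
    current_dests: destination address of each packet in the current interval.
    Returns (alarm, z_score, top_destination_share).
    """
    count = len(current_dests)
    mu = mean(history)
    sigma = pstdev(history)
    if sigma > 0:
        z = (count - mu) / sigma
    else:
        # A perfectly flat history: any increase is treated as extreme.
        z = float("inf") if count > mu else 0.0
    if count:
        top_share = Counter(current_dests).most_common(1)[0][1] / count
    else:
        top_share = 0.0
    return (z > z_threshold and top_share > share_threshold), z, top_share
```

Requiring both indicators is a design choice in this sketch: a heavy but widely spread workload, or a light workload aimed at one destination, does not raise the alarm on its own.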