§1.1 Field of the Invention
The present invention concerns Internet security. More specifically, the present invention concerns defending against distributed denial of service (DDoS) attacks on networks.
§1.2 Background Information
§1.2.1 Defense Systems for DDoS Attacks
DDoS attacks aim to interrupt localized Internet services, making them temporarily unavailable, by flooding the victim (a single Web host or an entire stub network served by an ISP) with a high volume of seemingly legitimate but malicious packets originating from many different sources. To stop DDoS attacks while they are in progress, without manual identification, characterization, and filter configuration on ISP routers, methods based on marking and traceback protocols (See, e.g., S. Bellovin, M. Leech, and T. Taylor, “ICMP Traceback Messages,” draft-ietf-itrace-01.txt, Internet draft, October 2001; and A. Yaar, A. Perrig, and D. Song, “FIT: Fast Internet Traceback,” IEEE Infocom, March 2005.) and pushback mechanisms (See, e.g., J. Ioannidis and S. M. Bellovin, “Implementing Pushback: Router-Based Defense Against DDoS Attacks,” Network and Distributed System Security Symp., February 2002; and D. K. Y. Yau, J. C. S. Lui, and F. Liang, “Defending Against Distributed Denial-of-Service Attacks with Max-min Fair Server-centric Router Throttles,” IWQoS, 2002) have recently been proposed. Intrusion pattern recognition has also been proposed by the data mining community to automate extraction of hidden predictive information from databases, including offline machine-learning approaches (See, e.g., W. Lee and S. J. Stolfo, “Data Mining Approaches for Intrusion Detection,” the 7th USENIX Security Symp., January 1998; and D. Marchette, “A Statistical Method for Profiling Network Traffic,” the 1st USENIX Workshop on Intrusion Detection and Network Monitoring, April 1999) and online approaches such as D-WARD (See, e.g., J. Mirkovic, G. Prier, and P. Reiher, “Attacking DDoS at the Source,” ICNP, November 2002). A combination of static and dynamic statistical filters has also been proposed (See, e.g., Q. Li, E. C. Chang, and M. C. Chan, “On the Effectiveness of DDoS Attacks on Statistical Filtering,” IEEE Infocom, March 2005).
There are also commercial products, such as those from Asta Networks and Cisco (See, e.g., Asta Networks Inc., http://www.astanetworks.com; and Cisco NetRanger Overview, http://www.cisco.com/univercd/cc/td/doc/product/iaabu/csids/csids1/csidsug/overview.htm), that detect and mitigate specific types of known DDoS attacks, especially those generated by well-known DDoS attack tools. However, their signature-based approach makes them vulnerable to new types of DDoS attacks. Arbor Networks' product (See, e.g., Arbornetworks Com., http://www.arbornetworks.com) mitigates DDoS attacks with the traceback approach, requiring the precise characterization of the attacking packets. Mazu, Riverhead (currently Cisco) and Cyberoperations products (See, e.g., Mazu Networks Inc., http://www.mazunetworks.com; and Cyber-operation Com., http://www.cyberoperations.com) are built on statistics-based adaptive filtering techniques. Most of these solutions do not fully automate packet differentiation and discarding. Instead, they only recommend a set of binary filter rules to the network administrator.
It would be useful to provide a DDoS defense system that is flexible enough to cope with new and more sophisticated attacks in the future, and that offers online automated approaches that are more scalable in terms of network operating speed and the number of potential targets to be protected. PacketScore (See, e.g., Y. Kim, W. C. Lau, M. C. Chuah, and H. J. Chao, “PacketScore: Statistics-based Overload Control against Distributed Denial of Service Attacks,” IEEE Infocom, April 2004) proposes a statistics-based overload control approach that efficiently addresses key scalability issues in a backbone implementation, allowing a large number of targets to be protected at high speed. It is a proactive defense system by nature, able to detect and block never-seen-before attacks. Essentially, it detects and filters DDoS attacks based on a packet-scoring approach. Arriving packets are given scores based on their packet attribute values (in the IP, TCP, or UDP header) as compared to nominal traffic profiles, and are selectively discarded if their scores are below a dynamic threshold.
Although PacketScore is promising, it would be useful to provide improved packet scoring schemes. For example, it would be useful to lower implementation complexity, increase attack detection and differentiation accuracies, and increase adaptability against complex DDoS attacks.
§1.2.1.1 Perceived Limitations of the Current CLP-Based PacketScore Scheme
Here, the previously proposed PacketScore scheme is reviewed. FIG. 5 depicts the support of distributed detection and overload control by multiple Detecting-Differentiating-Discarding Routers (3D-Rs) 510 on a defense perimeter and DDoS Control Servers (DCSs) 530/540. Let n be the total number of 3D-Rs along the defense perimeter. The use of DCSs 530/540 reduces the peer communications among the 3D-Rs from O(n²) to O(n), and spares the 3D-Rs 510 from the burden of managing a large number of per-end-point-target nominal traffic profiles. Since a DCS 530/540 exchanges only control messages with the 3D-Rs 510, it can be safely kept away from the normal data path (out of the reach of potential DDoS attack traffic). To facilitate load balancing and improve scalability, the set of potential end-point targets within a domain can be partitioned among multiple DCSs 530/540.
The PacketScore scheme uses a statistics-based Bayesian method called Conditional Legitimate Probability (CLP) to calculate packets' scores (hereinafter referred to as “the CLP-based scheme”). It consists of the following three phases. First, an attack detection and victim identification phase might be performed by monitoring four key traffic statistics of each protected target (packets-per-second, bits-per-second, number of active flows, and new arriving flow rate) while keeping minimum per-target states. The key traffic parameters are compared to the nominal traffic profile parameters. A DCS 530/540 aggregates the reports from multiple 3D-Rs 510 on a defense perimeter, to confirm whether there is actually an ongoing attack.
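The detection phase described above can be illustrated with a minimal Python sketch. The names `TrafficStats`, `anomalous_stats`, and `dcs_confirms_attack`, the `tolerance` multiplier, and the majority `quorum` rule are all assumptions made for illustration; the scheme itself states only that the four statistics are compared to the nominal profile and that a DCS aggregates the 3D-R reports.

```python
from dataclasses import dataclass, fields


@dataclass
class TrafficStats:
    """The four per-target statistics monitored in the detection phase."""
    packets_per_sec: float
    bits_per_sec: float
    active_flows: float
    new_flows_per_sec: float


def anomalous_stats(measured, nominal, tolerance=2.0):
    """Names of statistics whose measured value exceeds tolerance * nominal.

    The multiplicative tolerance is a hypothetical comparison rule.
    """
    return [
        f.name
        for f in fields(TrafficStats)
        if getattr(measured, f.name) > tolerance * getattr(nominal, f.name)
    ]


def dcs_confirms_attack(router_reports, quorum=0.5):
    """Aggregate per-router reports (one list of anomalous names per 3D-R).

    The attack is confirmed when more than `quorum` of the routers report
    at least one anomalous statistic; the quorum rule is an assumption.
    """
    if not router_reports:
        return False
    flagged = sum(1 for report in router_reports if report)
    return flagged / len(router_reports) > quorum
```

Each 3D-R would evaluate `anomalous_stats` locally for a protected target, and the DCS would call `dcs_confirms_attack` on the collected reports before the scoring phase is triggered.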
A second phase might differentiate attacking packets from legitimate ones by giving a score to every packet destined to the identified victim. Scores are determined by comparing the frequency of each attribute value a packet carries in the current (measured) traffic profile against its frequency in the nominal traffic profile. More specifically, they are computed by CLP, and stored in the form of scorebooks. Under this method, an attribute value shared by attacking packets is assigned a lower score, because its relative frequency in the current traffic profile increases against the nominal one. Conversely, an attribute value shared by legitimate packets is assigned a higher score, because its relative frequency decreases. As a result, PacketScore can efficiently differentiate legitimate packets among suspicious traffic.
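A scorebook for a single attribute can be sketched as follows. This is an illustration under stated assumptions, not the scheme's definitive formula: it assumes the partial score of an attribute value is the logarithm of the ratio of its nominal relative frequency to its measured relative frequency, and it uses a hypothetical `floor` to avoid division by zero for values missing from one profile.

```python
import math


def build_scorebook(nominal, measured, floor=1e-6):
    """Map each attribute value to log(nominal_freq / measured_freq).

    `nominal` and `measured` are normalized histograms (value -> relative
    frequency). Values whose share of traffic grew under attack get a
    negative score; values whose share shrank get a positive score.
    The log-ratio form and the `floor` constant are assumptions.
    """
    scorebook = {}
    for value in set(nominal) | set(measured):
        p_nominal = max(nominal.get(value, 0.0), floor)
        p_measured = max(measured.get(value, 0.0), floor)
        scorebook[value] = math.log(p_nominal / p_measured)
    return scorebook
```

For example, if UDP (protocol 17) carries 20% of the nominal traffic but 60% of the measured traffic, its score is negative, while TCP (protocol 6), whose share fell from 80% to 40%, receives a positive score.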
Third, packets might be discarded selectively by comparing the packet's score with a dynamic threshold, which is adjusted according to (1) the score distribution of all suspicious packets and (2) the congestion level of the victim.
In a PacketScore scheme, each arriving packet obtains a set of partial scores from a scorebook via a lookup operation, according to the attribute values it carries. The packet score—the sum of the packet's partial scores—is then compared to a dynamic threshold in an overload control unit. Packets whose scores are less than the threshold will be discarded.
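The lookup-and-compare step described above can be sketched as follows. The attribute names, the dictionary packet representation, and the `default` score given to values absent from a scorebook are hypothetical choices made for this example.

```python
def packet_score(packet, scorebooks, default=0.0):
    """Sum the partial scores looked up for each attribute the packet carries.

    `scorebooks` maps an attribute name to a dict of value -> partial score.
    Values missing from a scorebook receive `default` (an assumption).
    """
    return sum(
        scorebooks[attr].get(packet[attr], default)
        for attr in scorebooks
        if attr in packet
    )


def should_discard(packet, scorebooks, threshold):
    """A packet is discarded when its score is below the dynamic threshold."""
    return packet_score(packet, scorebooks) < threshold
```

Because the packet score is a plain sum of table lookups, the per-packet work is one lookup per attribute plus one comparison.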
A nominal profile is a set of baselines collected during a period in which the protected network was presumed to be free of attacks. It characterizes the traffic within a certain period of time by measuring the average throughput in packets or bytes per second (used to set an acceptable output packet rate), and by creating normalized histograms of packet attributes. A measured profile has the same structure, but characterizes the online traffic instead.
The comparison of the two profiles provides PacketScore with enough parameters to distinguish legitimate packets from DDoS attacking packets with the use of a metric or score. The degree of divergence between these profiles (the higher the disproportion, the higher the likelihood of an attack) provides packet differentiation.
The following attributes are currently measured on both profiles to generate the histograms: IP protocol-type values, packet sizes, Time-to-Live (TTL) values, server port numbers, 16-bit source/destination IP address prefixes (as an approximation to the IP subnet calculation), TCP/IP header lengths, and TCP flag patterns.
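The profile structure described above (an average throughput plus one normalized histogram per attribute) can be sketched for a subset of these attributes. The function names, the dictionary packet representation, and the choice of only three attributes are assumptions made to keep the example short.

```python
from collections import Counter


def prefix16(ip):
    """16-bit prefix of a dotted-quad IPv4 address, e.g. '10.1.2.3' -> '10.1'."""
    return ".".join(ip.split(".")[:2])


def build_profile(packets, duration_sec):
    """Build a profile: average packet rate and normalized attribute histograms.

    Each packet is a dict with 'protocol', 'ttl', and 'src_ip' keys
    (a hypothetical representation). The same function serves for a nominal
    profile (attack-free traffic) and a measured profile (online traffic).
    """
    counts = {"protocol": Counter(), "ttl": Counter(), "src_prefix16": Counter()}
    for pkt in packets:
        counts["protocol"][pkt["protocol"]] += 1
        counts["ttl"][pkt["ttl"]] += 1
        counts["src_prefix16"][prefix16(pkt["src_ip"])] += 1
    total = len(packets)
    histograms = {
        name: {value: n / total for value, n in counter.items()}
        for name, counter in counts.items()
    }
    return {"packets_per_sec": total / duration_sec, "histograms": histograms}
```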
Iceberg-style histograms (See, e.g., B. Babcock et al., “Models and Issues in Data Stream Systems,” ACM Symp. on Principles of Database Sys., June 2002) are used so that the nominal profile includes only the non-null attribute values (icebergs) that appear more frequently than a preset threshold, say x %. This keeps the profile to a manageable size and reduces the lookup time. Iceberg-style histograms ordinarily require two passes over the input data to collect nominal profile data. Instead, a one-pass iceberg-style histogram maintenance/update is implemented efficiently in hardware by applying a two-stage pipelined approximation similar to what is proposed in R. M. Karp, C. H. Papadimitriou, and S. Shenker, “A Simple Algorithm for Finding Frequent Elements in Streams and Bags,” ACM Trans. on Database Systems, Volume 28, Issue 1, pp. 51-55, March 2003.
In this method, data processing is divided into periods, where period t−1 scans for icebergs to be counted in period t, which in turn scans for icebergs to be used in period t+1, and so on, as shown in FIG. 6. FIG. 6 contains real attribute values and frequencies from the flag nominal profile, using a 1% threshold. Arriving packets in period t−1 possessed flag attribute values 2, 16, 17, 18, 20, and 24. These values (or icebergs) are counted in period t, with the number of occurrences being 2335, 3850, 154, 88, 101, and 991, respectively. At the same time, in period t, arriving packets have flag attribute values 2, 16, 17, 19, 22, and 24, composing the icebergs to be counted in period t+1.
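The period-by-period pipelining can be sketched as follows. The class name is hypothetical, and the scan stage here uses an exact `Counter` as a stand-in for the approximate hardware scan; only the hand-off of icebergs from one period to the next is meant to be illustrated.

```python
from collections import Counter


class PipelinedIcebergHistogram:
    """Counts in period t only the icebergs that were found in period t-1."""

    def __init__(self, threshold):
        self.threshold = threshold  # e.g. 0.01 for a 1% iceberg threshold
        self.tracked = set()        # icebergs identified in the previous period

    def process_period(self, values):
        """Process one period and return the counts of the tracked icebergs."""
        counted = dict.fromkeys(self.tracked, 0)
        scan = Counter()
        for value in values:
            if value in counted:
                counted[value] += 1   # counting stage: previous period's icebergs
            scan[value] += 1          # scan stage: find icebergs for next period
        total = len(values)
        self.tracked = {
            v for v, n in scan.items() if total and n / total > self.threshold
        }
        return counted
```

The first call returns an empty result because no icebergs have been identified yet; from the second period onward, each call returns counts for the values that crossed the threshold in the period before it.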
Scoring is obtained as a direct comparison of nominal and measured profiles using CLP as a metric. After the scores are computed, it is necessary to determine which score represents the upper-bound threshold that will distinguish legitimate packets from attacking ones, on a per-packet, per-score basis. This score is chosen to satisfy throughput requirements: it regulates the output throughput, keeping it close to a previously set target throughput.
This overload control process is achieved by having a Cumulative Distribution Function (CDF) of the scores of all incoming packets created and maintained using one-pass quantile computation techniques (See, e.g., F. Chen, D. Lambert, and J. C. Pinheiro, “Incremental Quantile Estimation for Massive Tracking,” 6th International Conference in Knowledge Discovery and Data Mining, August 2000; and M. Greenwald and S. Khanna, “Space-Efficient Online Computation of Quantile Summaries,” In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, May 2001). Next, the discarding threshold (THd) is calculated (and dynamically adjusted) using a load-shedding algorithm (See, e.g., S. Kasera, J. Pinheiro, C. Loader, M. Karaul, A. Hari, and T. LaPorta, “Fast and Robust Signaling Overload Control,” Proceedings of 9th International Conference on Network Protocols (ICNP), November 2001).
According to this algorithm, the congestion level of the victim is measured, allowing the victim system to opportunistically accept more potentially-legitimate traffic as its capacity permits. As shown in FIG. 7, the resulting THd 705 is simply a discarding threshold associated with a corresponding drop rate. Incoming packets whose packet scores are below the THd 705 are discarded. The key idea here is to prioritize and drop packets based on their score values.
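The relationship between the drop rate and the discarding threshold can be illustrated with a simplified sketch. It assumes the required drop fraction is 1 minus the ratio of target rate to incoming rate, and it reads the threshold from an exactly sorted list of scores; the scheme itself relies on one-pass quantile estimation and the cited load-shedding algorithm, neither of which is reproduced here.

```python
import math


def discard_threshold(scores, incoming_rate, target_rate):
    """Pick the score below which packets are discarded.

    The drop fraction (1 - target_rate / incoming_rate) is an assumed,
    simplified load-shedding rule. The threshold is the score at that
    quantile of the empirical score distribution.
    """
    if not scores or incoming_rate <= target_rate:
        return -math.inf  # no overload: no score is below the threshold
    drop_fraction = 1.0 - target_rate / incoming_rate
    ordered = sorted(scores)
    index = min(int(drop_fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]
```

With ten distinct scores, an incoming rate of 1000 packets per second, and a target of 500, the threshold falls at the median, so the five lowest-scoring packets are discarded and the five highest-scoring ones are admitted.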
In the CLP-based scheme, a scorebook, a collection of each attribute value's score, is first generated based on Bayesian CLP. The score associated with each attribute value is obtained from two histograms: one from the currently measured profile and the other from the nominal profile. Implementation complexity arises from the calculation of these two histograms for each packet attribute.
It is very challenging to provide an effective overload control when a system is under fast-changing DDoS attacks. The previously proposed PacketScore scheme uses a CDF and a load-shedding algorithm to generate the discarding threshold THd. Packets with scores lower than the threshold are discarded. However, if an attacker changes its attack type and intensity, the THd—which was valid for a certain range of scores—would very likely become invalid, therefore compromising the differentiation capacity, until a more adequate THd is dynamically set. This situation tends to worsen as the scores of a measurement period are used in the next period, while the attacks continue to change.
It has been observed that, the moment the attacks change, spikes of admitted traffic appear (due to the threshold invalidation explained above), sometimes lasting for a relatively long period of time. Even with frequent threshold updates within a short period of time (the only way to revalidate the threshold), the CLP scheme still suffers from this problem. Thus, it would be useful to provide effective control even under fast-changing DDoS attacks.