Publications and other reference materials referred to herein, including references cited therein, are incorporated herein by reference in their entirety and are numerically referenced in the following text and respectively grouped in the appended Bibliography which immediately precedes the claims.
The infrastructure of a large Internet Service Provider (ISP) or Network Service Provider (NSP) typically comprises a constantly growing network of heterogeneous routers interconnecting millions of customer-devices. This network enables the network customers to exchange data of various formats created and consumed by a plethora of applications. Recent industry reports [1] suggest that customers obtain electronic threats (eThreats) mainly from the internet. eThreats comprise a variety of attacks which can be classified into three main categories: worm-related, non-worm related (e.g., virus, Trojan), and probes (e.g., spyware, adware, identity theft, and phishing).
While methods and technology for securing networks against intrusions continue to evolve, the basic problems are extremely challenging for a number of reasons. First, hackers who perpetrate intrusions continue to find ingenious ways to compromise remote hosts and frequently make their tools publicly available. Second, the size and complexity of the Internet, including end-host operating systems, make it likely that there will continue to be vulnerabilities for a long time to come. Third, sharing of information on intrusion activity between networks is complicated by privacy issues, and while there are certainly anecdotal reports of specific port scanning methods and attacks, there is very little broad understanding of intrusion activity on a global basis [2-5, 15, 16]. Because of these challenges, current best practices for Internet security rely heavily on word-of-mouth reports of new intrusions and security holes through entities such as CERT (www.cert.org) and DSHIELD (www.dshield.org).
During the first six months of 2004, the overall number of new Windows viruses and worms grew by 450% compared to the same period in 2003 [1]. The average time between the announcement of a new vulnerability and the appearance of associated exploit code was 5.8 days. Once exploit code is made available, a new vulnerability can be widely scanned for and exploited quickly. This means that, on average, customers have less than a week to patch all their systems on which the vulnerable application is running. The potential threat posed by a new vulnerability is worsened if the application in which the vulnerability is found is widely deployed, e.g., a Web server or database application. Recent widespread worms have illustrated the dangers of the narrow “vulnerability-to-exploit” window (e.g., the Witty worm was discovered only two days after the vulnerability it exploited was made public). The ability of malicious code writers to rapidly upgrade bot (short for “robot”) networks compounds the dangers posed by such a brief vulnerability-to-exploit window. Furthermore, as worms are becoming more sophisticated and, in many cases, remotely controlled by attackers, the potential impact on enterprises and customers is significant. Once a new vulnerability is announced, organizations must introduce security countermeasures before an exploit is made available, or risk having their systems exploited.
In addition to the worm-related attacks which propagate in the network in various ways, other types of malicious codes are propagated manually and in many cases the malicious code is actually an unobtrusive information-gathering probe.
As a case in point, Trojans are increasingly being installed via malicious Web sites. They exploit browser vulnerabilities that allow malicious code authors to download and execute the Trojans with little or no conscious user interaction. Trojans appear to serve some useful purpose, which encourages users to download and run them, but actually carry a destructive function. They may masquerade as legitimate applications available for download from various sources or be sent to an unsuspecting user as an email attachment. Since Trojans do not replicate like viruses and worms (although they may be delivered by worms) they typically do not receive as much media attention. However, if they are executed on a computer they can be extremely destructive, with payloads ranging from unauthorized export of confidential data to surreptitious reformatting of hard drives.
The threatening situation described above has been amplified in part by increased global terrorism and criminal activities on the Web in recent years. Today the Web is used as an enabling platform for a plethora of illegal activities ranging from credit card fraud, through identity phishing, to transferring money and orders. Web application attacks are expected to increase in the near future; targeted attacks on firewalls, routers, and other security devices protecting users' systems will be a growing security concern; sophisticated methods of control and attack synchronization that are difficult to detect and locate will be used, and finally, more attempts to exploit mobile devices will be documented.
The eThreat posed to an NSP is especially significant because NSPs are huge, service-oriented companies with tens of millions of customers, operating in an open networked environment which blends a plethora of technologies. This situation makes the NSP especially susceptible to eThreats propagated across networks. Thus, it calls for a significant investment in developing a comprehensive conceptual model that will enable the detection and prevention of both known and new forms of eThreats.
Many different types of defense mechanisms have been proposed for dealing with the above described eThreats. Among these mechanisms are the following:
Data Mining Approach: The Minnesota Intrusion Detection System (MINDS).
Data mining has been used extensively in recent years as an enabling technology for intrusion detection applications [7, 8]. The overall goal for MINDS [9, 10] is to be a general framework and system for detecting attacks and threats to computer networks. Data generated from network traffic monitoring tends to have very high volume, dimensionality and heterogeneity. Coupled with the low frequency of occurrence of attacks, this makes standard data mining algorithms unsuitable for detecting attacks. In addition, cyber attacks may be launched from several different locations and targeted to many different destinations, thus creating a need to analyze network data from several locations/networks in order to detect these distributed attacks. The first step in MINDS includes constructing features that are used in the data mining analysis. Basic features include source IP (Internet Protocol) address, source port, destination IP address, destination port, protocol, flags, number of bytes, and number of packets. Derived features include time-window and connection-window based features. Time-window based features are constructed to capture connections with similar characteristics in the last t seconds, since typically denial-of-service (DoS) and scanning attacks involve hundreds of connections. After the feature construction step, the known attack detection module is used to detect network connections that correspond to attacks for which the signatures are available, and then to remove them from further analysis. Next, the data is fed into the MINDS anomaly detection module that uses an outlier detection algorithm to assign an anomaly score to each network connection. A human analyst then has to look at only the most anomalous connections to determine if they are actual attacks or other interesting behavior. The MINDS association pattern analysis module summarizes network connections that are ranked highly anomalous in the anomaly detection module.
The human analyst provides feedback when analyzing the created summaries of detected attacks and deciding whether these summaries are helpful in creating new rules that may be further used in the known attack detection module.
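By way of illustration only, the time-window based feature construction described above may be sketched as follows. The record layout (Conn), the function name time_window_features, and the choice of two derived counts (same source, same destination) are assumptions made for this sketch and are not taken from MINDS itself.

```python
from collections import deque, namedtuple

# Hypothetical connection record holding a subset of the basic features.
Conn = namedtuple("Conn", "ts src_ip src_port dst_ip dst_port")


def time_window_features(conns, t=5.0):
    """For each connection (assumed sorted by timestamp), count the earlier
    connections seen in the last t seconds that share its source IP and
    those that share its destination IP."""
    window = deque()  # connections seen within the last t seconds
    features = []
    for c in conns:
        # Discard connections that have fallen out of the time window.
        while window and c.ts - window[0].ts > t:
            window.popleft()
        same_src = sum(1 for w in window if w.src_ip == c.src_ip)
        same_dst = sum(1 for w in window if w.dst_ip == c.dst_ip)
        features.append((same_src, same_dst))
        window.append(c)
    return features
```

A host that opens hundreds of connections within t seconds yields a steadily growing same-source count, which is the kind of derived feature an outlier detection step could then score.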
The Signature-Based Approach: Bloom Filters
Bloom filters [11-13] were used to build a system that scans Internet traffic. Packets enter the system and are processed by Internet Protocol (IP) wrappers. The data in the packet goes to the input buffer and then flows through the content pipeline. As the packet passes through the pipeline, multiple Bloom engines scan different window lengths for signatures of different lengths. Data leaves the content pipeline, flows to the output buffer, streams through the wrappers, and then packets are re-injected into the network. If a Bloom engine detects a match, a hash table is queried to determine if an exact match occurred. If the queried signature is an exact match, the malicious content can be blocked and an alert message is generated within a User Datagram Protocol (UDP) packet, informing a network administrator, an end-user or an automated process that a matching signature has been detected.
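By way of illustration only, the two-stage lookup described above (a Bloom filter pre-check followed by an exact hash-table check) may be sketched in software as follows. The class and function names, the filter size, the number of hash functions, and the use of SHA-256 are assumptions made for this sketch; the cited system is implemented in hardware and is not reproduced here.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; one byte per bit position for readability."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)

    def _positions(self, item):
        # Derive n_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[p] for p in self._positions(item))


def scan_payload(payload, signatures):
    """Return sorted (offset, signature) pairs found in payload."""
    engines = {}  # one Bloom "engine" per signature length
    for sig in signatures:
        engines.setdefault(len(sig), BloomFilter()).add(sig)
    exact = set(signatures)  # stands in for the exact-match hash table
    hits = []
    for length, bloom in engines.items():
        for off in range(len(payload) - length + 1):
            window = payload[off:off + length]
            # The exact check runs only when the Bloom engine reports a match.
            if bloom.might_contain(window) and window in exact:
                hits.append((off, window))
    return sorted(hits)
```

The Bloom engines discard most windows cheaply, so the exact lookup is consulted only for the few windows that may be signatures, including any false positives.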
Dynamically Reconfigurable Hardware: Field Programmable Gate Arrays (FPGA)
A platform has been implemented that actively scans and filters Internet traffic for Internet worms and viruses at multi-Gigabit/second rates using the Field-programmable Port Extender (FPX) [17-21]. Modular components implemented with Field Programmable Gate Array (FPGA) logic on the FPX process packet headers and scan for signatures of malicious software (malware) carried in packet payloads. FPGA logic is used to implement circuits that track the state of Internet flows and search for regular expressions and fixed-strings that appear in the content of packets.
Sequential Hypothesis Testing and Credit-Based Connection Rate Limiting (CBCRL): a Worm Detection System
The application of mathematical modeling can be helpful for better defending systems against malware attacks [27, 30, 35].
Port Scanning Detection: The DIB:S/TRAFEN (The Dartmouth ICMP BCC: System Tracking and Fusion Engine)
Port scanning detection [26] is an effective technique for providing defense against port scanning attacks, which attempt to discover communication channels that can be penetrated and exploited. As a case in point, the idea underlying DIB:S/TRAFEN [25] is that routers send “blind carbon copies” of ICMP (Internet Control Message Protocol) type 3 (destination unreachable) messages to a Collector, which analyzes the messages, looking for signatures of worm scanning and correlating observations to track worm infections. The technique employs a simulator system capable of simulating worm infections and collecting the ICMP type 3 messages in a tcpdump file for further analysis. For the collection of the ICMP destination unreachable messages, the system relies on Internet routers to forward copies of those messages that they generate to a central collector. From there, they are distributed to an array of analyzers that all report back to a Correlator system. The analyzers generate reports of significant behavior and create a set of identifying characteristics. Based on those characteristics the Correlator determines whether an active worm is propagating by comparing reports received from other analyzers. Information provided by the ICMP protocol has been employed by other security applications as well [23, 24].
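By way of illustration only, one simple analysis that a collector of this kind could perform on the forwarded ICMP messages may be sketched as follows. The message tuple layout, the function name suspected_scanners, and the distinct-destination threshold are assumptions made for this sketch and are not taken from DIB:S/TRAFEN.

```python
from collections import defaultdict

ICMP_DEST_UNREACHABLE = 3  # ICMP type 3


def suspected_scanners(messages, threshold=3):
    """messages: iterable of (icmp_type, original_source, unreachable_dest).

    Returns the sorted original sources whose packets triggered
    destination-unreachable replies for at least `threshold` distinct
    destinations, a pattern typical of random-address scanning."""
    targets = defaultdict(set)
    for icmp_type, orig_src, dest in messages:
        if icmp_type == ICMP_DEST_UNREACHABLE:
            targets[orig_src].add(dest)
    return sorted(src for src, dests in targets.items()
                  if len(dests) >= threshold)
```

A host probing many unused addresses provokes unreachable replies from many routers, so the count of distinct unreachable destinations per source is a natural first signal to correlate.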
Static Analysis of Executables: The SAFE System
SAFE is a virus detector resilient to code obfuscations [28]. To detect malicious patterns in executables, an abstract representation of the malicious code is built. The abstract representation is the “generalization” of the malicious code, e.g., it incorporates obfuscation transformations, such as superfluous changes in control flow and register reassignments. Similarly, an abstract representation of the executable in which one is trying to find a malicious pattern must be constructed. Once the generalization of the malicious code and the abstract representation of the executable are created, it is possible to detect the malicious code in the executable. The malicious code is generalized into an automaton with uninterpreted symbols. Uninterpreted symbols provide a generic way of representing data dependencies between variables without specifically referring to the storage location of each variable. A pattern-definition loader component takes a library of abstraction patterns and creates an internal representation. These abstraction patterns are used as alphabet symbols by the malicious code automaton. An executable loader component transforms the executable into an internal representation, here the collection of control flow graphs (CFGs), one for each program procedure. An annotator component takes as input a CFG from the executable and the set of abstraction patterns and produces an annotated CFG, the abstract representation of a program procedure. The annotated CFG includes information that indicates where a specific abstraction pattern was found in the executable. The annotator runs for each procedure in the program, transforming each CFG. The detector component computes whether the malicious code (represented by the malicious code automaton) appears in the abstract representation of the executable (created by the annotator). This component uses an algorithm based upon language containment and unification.
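By way of illustration only, the idea of matching a generalized malicious pattern against code, with placeholder symbols that may stand for any register as long as they are bound consistently, may be sketched as follows. This is a deliberately simplified toy that operates on a flat instruction list rather than on annotated CFGs; the tuple encoding of instructions, the "?" prefix for placeholder symbols, and the function names are assumptions made for this sketch and do not reproduce the SAFE algorithm.

```python
def unify(template, instr, binding):
    """Match one instruction against a template. Operands starting with '?'
    are placeholder symbols that must bind consistently. Returns the
    extended binding, or None if the instruction does not match."""
    if len(template) != len(instr) or template[0] != instr[0]:
        return None
    new_binding = dict(binding)
    for t, actual in zip(template[1:], instr[1:]):
        if t.startswith("?"):
            # Bind on first use; reject if already bound to something else.
            if new_binding.setdefault(t, actual) != actual:
                return None
        elif t != actual:
            return None
    return new_binding


def contains_pattern(instrs, pattern):
    """True if pattern occurs in instrs, ignoring inserted 'nop' instructions
    and allowing any consistent register assignment."""
    code = [i for i in instrs if i[0] != "nop"]  # strip a trivial obfuscation
    for start in range(len(code) - len(pattern) + 1):
        binding = {}
        for template, instr in zip(pattern, code[start:]):
            binding = unify(template, instr, binding)
            if binding is None:
                break
        else:
            return True
    return False
```

Because the pattern names no concrete register, the same pattern matches a variant in which the registers have been reassigned, while a sequence that breaks the data dependency between instructions is rejected.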
Another application of static code analyses for detecting buffer overflow attacks is described in [29].
Vulnerability Driven Network Filters: The Shields System
Software patching has not been effective as a first-line defense against large-scale worm attacks, even when patches have long been available for their corresponding vulnerabilities. Generally, people have been reluctant to patch their systems immediately, because patches are perceived to be unreliable and disruptive to apply. Shields [31-34] uses vulnerability-specific, exploit-generic network filters installed in end-systems once a vulnerability is discovered, but before a patch is applied. These filters examine the incoming or outgoing traffic of vulnerable applications, and correct traffic that exploits vulnerabilities. Shields are less disruptive to install and uninstall, easier to test for bad side effects, and hence more reliable than traditional software patches. The architecture of Shields functions as follows: Whenever a new Shield policy arrives or an old policy is modified, the Policy Loader integrates the new policy with an existing specification (Spec) if one exists, or creates a new one otherwise. The Shield policy is expressed in the Shield policy language. Policy loading involves syntax parsing, and the resulting syntax tree is also stored in the Spec for the purpose of run-time interpretation of shielding actions. When raw bytes arrive at Shield from a port, an Application Dispatcher unit is invoked to determine which Spec to reference for the arrived data, based on the port number. The Application Dispatcher forwards the raw bytes and the identified Spec to a Session Dispatcher unit for event and session identification. On obtaining the locations of the session ID, message type, and message boundary marker from the corresponding Spec, the Session Dispatcher extracts multiple messages (if applicable), recognizes the event type and session ID, and then dispatches the event to the corresponding state machine instance. There is one state machine instance (SMI) per session.
Given a newly-arrived event and the current state maintained by the corresponding session state, the SMI consults the Spec regarding which event handler to invoke. Then the SMI calls a Shield Interpreter unit to interpret the event handler. The Shield Interpreter interprets the event handler, which specifies how to parse the application-level protocol payload and examine it for exploits. It also carries out actions like packet-dropping, session tear-down, registering a newly-negotiated dynamic port with Shield, or setting the next state for the current SMI.
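By way of illustration only, the per-session state machine dispatch described above may be sketched as follows. The Spec layout (a table keyed by state and event type), the handler return convention, the action names, and the example length check are all assumptions made for this sketch; they do not reproduce the Shield policy language or its interpreter.

```python
class Spec:
    """Hypothetical specification: an initial state plus a table mapping
    (state, event_type) to an event handler."""

    def __init__(self, initial_state, handlers):
        self.initial_state = initial_state
        self.handlers = handlers


class SessionDispatcher:
    """Keeps one state machine instance (here, just its state) per session."""

    def __init__(self, spec):
        self.spec = spec
        self.state_by_session = {}

    def dispatch(self, session_id, event_type, payload):
        state = self.state_by_session.get(session_id, self.spec.initial_state)
        handler = self.spec.handlers.get((state, event_type))
        if handler is None:
            return "pass"  # no handler for this event in this state
        action, next_state = handler(payload)
        if action == "teardown":
            self.state_by_session.pop(session_id, None)
        else:
            self.state_by_session[session_id] = next_state
        return action


def on_hello(payload):
    return ("pass", "ready")


def on_request(payload):
    # Hypothetical vulnerability: a request longer than 8 bytes would
    # overflow a buffer in the protected application, so it is dropped.
    if len(payload) > 8:
        return ("drop", "ready")
    return ("pass", "ready")


EXAMPLE_SPEC = Spec("start", {
    ("start", "HELLO"): on_hello,
    ("ready", "REQUEST"): on_request,
})
```

Keeping the filter keyed on the vulnerability (an over-long request in a particular protocol state) rather than on a particular exploit's bytes is what makes such a filter vulnerability-specific and exploit-generic.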
The academic literature reviewed above suggests that there exists a plethora of approaches, models and tools for addressing the problem of eThreats. Nevertheless, each initiative described above provides a partial solution to a very small part of a particular problem. None of them targets all of the major facets of the eThreat protection challenge. Specifically: MINDS deals with threats to computer networks only and does not protect devices such as PCs, cell-phones, etc. It provides neither real-time detection nor protection against polymorphism/metamorphism. FPGA and Bloom Filters provide a solution focused on throughput performance criteria, but can only deal with certain kinds of known eThreats that can be identified by their hashing or regular expression signature. Shields and the methods of Sequential Hypothesis Testing/credit-based connection rate limiting address only worm propagation, whereas SAFE addresses only virus threats. Finally, DIB:S/TRAFEN deals only with port scanning detection.
All in all, it is clear that the above initiatives do not provide an overall satisfactory solution to the eThreat problem. The problem of eThreats has a dynamic nature, with new kinds of threats emerging and old threats evolving into different kinds of threats. For example, adware, spyware, and identity theft by way of phishing are “younger” threats compared to the virus threat and their impact has been felt substantially only in the last two to three years. Considering the fact that content on the Web cannot be effectively regulated, the eThreat challenge posed by crackers, terrorists, criminals, etc. is overwhelming.
It is therefore a purpose of the present invention to provide a system that offers a flexible and adaptive security platform against eThreats in NSP networks.
Further purposes and advantages of this invention will appear as the description proceeds.