1. Field of the Invention
The present invention relates to computer intrusion detection systems, and more particularly to intrusion detection systems based on application monitoring to identify known and novel attacks on a computer system.
2. Background of the Invention
Intrusion detection systems (“IDSs”) generally take advantage of the extensive auditing capabilities inherent in many computer operating systems and of other auditing facilities that may be added to a computer system. Such auditing systems can generate logs that capture every event that occurs during the operation of the computer, or may be configured to capture only a subset of information concerning specified events. The logs are examined, using manual or automated techniques, to identify possible intrusive activity. Most modern operating systems provide the means for capturing a particular user's or process's instructions to the computer operating system, thereby allowing for full accountability for access to the computer's resources. Such captured instructions include, for example, all system calls made by an application, all object requests made by an application, and information related to the individual processes spawned by the application. Operating systems providing such auditing facilities are well-known in the art, and are commonly referred to as C2 systems. More information on auditing facilities and requirements can be found in Trusted Computer System Evaluation Criteria (also known as the “Orange Book”), published by the National Computer Security Center.
For the Linux operating system and other variants of the well-known UNIX operating system, the well-known strace program logs system calls made by processes running on the computer, as well as the results of those system calls. For Sun Microsystems' well-known Solaris operating system, the Basic Security Module (“BSM”) produces an “event” record for individual processes, and can log over 200 events, including system instructions issued by each process. For Microsoft's well-known Windows NT operating system, base object auditing provides analogous auditing of a process's access to system resources: whenever an object is requested or accessed by a process, the audit log records the transaction. Other operating systems provide similar audit capabilities. Even when such facilities are not integral to the operating system, a suitable auditing system for gathering data regarding a user's or application's use of or interaction with the computer system's resources can be written and implemented according to techniques well-known in the art.
In the computer security arts, two basic strategies have been used in designing and implementing IDSs. In early systems, the basic strategy was to monitor the activities of the computer system's users to identify instances of intrusive user behavior. In such IDSs the goal was to identify user behavior indicating an attack on the system. Activities such as superuser login attempts, transfers of sensitive files, or failed file access attempts were flags for potential intrusive activity. One example of such a user-oriented IDS is the Intrusion Detection Expert System (“IDES”) developed by SRI International, as described by T. F. Lunt, “A Survey of Intrusion Detection Techniques,” in Computers and Security, Volume 12, 1993, pp. 405–418. Other examples are described by T. Lane and C. E. Brodley, “An Application of Machine Learning to Anomaly Detection,” in Proceedings of the 20th National Information Systems Security Conference, October, 1997, pp. 366–377. Lane and Brodley first build user profiles based on sequences of each user's normal command executions, and then attempt to detect an intruder based on deviations from the user's established profile. Similarly, D. Endler, “Intrusion Detection: Applying Machine Learning to Solaris Audit Data,” in Proceedings of the 1998 Annual Computer Security Applications Conference (ACSAC '98), December, 1998, Scottsdale, Ariz., pp. 268–279, describes using neural networks to learn users' behavior based on Sun Solaris BSM events recorded from user actions. A drawback of such user-based IDSs is that a user may slowly change his or her behavior to skew the profiling system such that intrusive behavior is deemed normal for that user. Moreover, user-based IDSs raise privacy concerns for users in that such a surveillance system monitors users' every move.
More recently, the focus has changed to monitoring the behavior of applications running on the computer. Such IDSs are based on the concept that every intrusion is, by definition, an unauthorized use of, or attempt to use, the computer's resources by means of various computer applications. Application-based IDSs are described in more detail below, following a brief discussion of general intrusion detection techniques.
In addition to being categorized according to the area of focus, i.e., user versus application, IDSs are also categorized according to the way intrusive behavior is identified. In one approach, the IDS analyzes computer audit logs looking for specific patterns corresponding to known attack signatures. This string-matching approach to intrusion detection is known as “misuse detection.” Misuse detection systems are described by J. Cannady, “Artificial Neural Networks for Misuse Detection,” in Proceedings of the 21st National Information Systems Security Conference, Oct. 5–8, 1998, pp. 443–456. An advantage of misuse detection systems is that such systems have a low false alarm rate. That is, if the system labels a behavior as intrusive, there is a high probability that an attack is present. Moreover, because a misuse IDS looks for known attacks, if an attack is detected, the exact nature of the attack is also identified. While misuse systems provide a fairly reliable way of detecting known attacks against systems, they can have a high false negative rate. That is, when even slight variations of known attacks are encountered, a misuse detection system will likely mislabel the behavior as normal. Unless an identical match to the previously stored signature is made, the attacker is likely to avoid detection. Because known attack signatures can be varied in countless ways, detection of even known attacks is a daunting problem. Moreover, a misuse detection approach cannot detect novel attacks against systems, and new attacks are developed on a continual basis.
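The signature-matching behavior described above can be illustrated with a minimal sketch. The signature names, the patterns, and the audit log format shown here are hypothetical and are chosen only for illustration; they are not taken from any particular misuse detection system.

```python
# Hypothetical signature table: names and patterns are illustrative assumptions.
SIGNATURES = {
    "passwd-file-read": "open /etc/passwd",
    "shell-spawn": "exec /bin/sh",
}


def match_signatures(log_lines, signatures=SIGNATURES):
    """Return (line_number, signature_name) for every exact substring hit."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for name, pattern in signatures.items():
            if pattern in line:
                hits.append((number, name))
    return hits


# Hypothetical audit log. Line 3 is a trivial variation of the attack on
# line 2 (an extra slash in the path) and no longer matches the signature.
audit_log = [
    "pid=101 open /etc/motd",
    "pid=102 exec /bin/sh",
    "pid=103 exec /bin//sh",
]

hits = match_signatures(audit_log)
print(hits)
```

In this sketch only the second log line is reported. The varied form on the third line passes as normal, which is the kind of missed detection the paragraph above attributes to exact signature matching.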
A second approach to identifying intrusive behavior is known as “anomaly detection.” In this approach, the normal operating characteristics of users or applications are observed to develop profiles reflective of normal behavior. The IDS then compares subsequent computer audit logs of user or application behavior with their associated profiles to determine whether or not the subsequent behavior has deviated from normal behavior. An advantage of an anomaly detection approach is the ability to detect novel attacks against the computer. However, a disadvantage of anomaly detection systems is their inability to identify the exact nature of the attack. An anomaly detection system can only detect that the behavior observed is unusual, such as might constitute an attack, but cannot identify the attack. Moreover, anomaly detection systems have been prone to excessive false positive identifications because any departure from normal operations is flagged as a possible attack, as discussed below.
In state-of-the-art anomaly detection systems, an equality matching (also referred to herein as “string-matching”) algorithm is used to identify anomalies. Equality matching algorithms compare, on a string-by-string basis, currently observed application behavior against a table of previously recorded normal behavior for that application. If a match is found for a string, the string is considered normal behavior. Otherwise, an anomaly counter is incremented. A drawback, however, to using an equality matching algorithm for intrusion detection is the inability to generalize from past observed behavior. That is, if the behavior currently observed during monitoring is not an exact match with the previously recorded behavior, then an anomaly is recorded. Equality matching techniques do not, by themselves, use any notion of similarity to determine whether currently observed behavior is sufficiently close to previously recorded behavior to warrant not recording an anomaly. For this reason, equality matching anomaly detection systems have traditionally had a high false alarm rate. That is, they tend to raise false warnings of intrusions, which diminishes their utility for the end user.
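A minimal sketch of equality matching, as characterized above, follows. The function names, the window width, and the system call traces are assumptions made for illustration and do not reproduce any specific prior art implementation.

```python
def sliding_windows(calls, width):
    """Split a trace of system calls into overlapping fixed-width strings."""
    return [tuple(calls[i:i + width]) for i in range(len(calls) - width + 1)]


def build_normal_table(training_traces, width):
    """Record every string seen during normal operation of the application."""
    table = set()
    for trace in training_traces:
        table.update(sliding_windows(trace, width))
    return table


def count_anomalies(trace, table, width):
    """Increment the anomaly counter for each string with no exact match."""
    return sum(1 for window in sliding_windows(trace, width) if window not in table)


# Hypothetical traces: the observed run differs from the recorded normal
# behavior only by one additional "read" call.
normal = [["open", "read", "write", "close"]]
observed = ["open", "read", "read", "write", "close"]

table = build_normal_table(normal, width=3)
anomalies = count_anomalies(observed, table, width=3)
print(anomalies)
```

Here a single benign extra call causes two of the three observed strings to be counted as anomalies, because the comparison has no notion of how close a string is to recorded behavior.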
Two of the more prominent efforts in the prior art to solve these problems are summarized below. The IDSs described are application monitoring systems directed to anomaly detection based on known behavior of computer applications.
University of New Mexico
A research group at the University of New Mexico (“UNM”) implemented string-matching algorithms in a system capturing short sequences of system calls to build profiles of behavior for various applications, as described by S. Forrest, S. A. Hofmeyr, and A. Somayaji, “Computer Immunology,” in Communications of the ACM, Volume 40, No. 10, October, 1997, pp. 88–9, and by S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, “A Sense of Self for Unix Processes,” in Proceedings of the 1996 IEEE Symposium on Security and Privacy, Oakland, Calif., May, 1996, pp. 120–128. The UNM group stored the profiles in tables representing the normal behavior for each application monitored. During online testing or deployment, short sequences of system calls made by each application were captured and compared with the associated table of normal behavior. If a particular sequence of system calls captured during the online operation of the application did not match any string in the application's associated table, an anomaly was recorded.
If the number of anomalies detected was a significant percentage of the overall number of short sequences captured during the online session, then the application's behavior was labeled intrusive. A problem with this technique is that actual intrusive behavior will tend to be washed out over time due to the occurrence of “noise.” Noise is caused by the normal variability to be expected in an application's behavior, which nevertheless results in anomalies being recorded. Noise tends to occur randomly throughout the application's execution, whereas actual intrusions result in concentrated occurrences of anomalies. Accordingly, when anomalies are measured as a percentage of the entire session, a concentrated burst of intrusive behavior can be diluted below the detection threshold, while a high level of noise can mask the intrusive behavior.
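The washing-out effect can be made concrete with a short numerical sketch. The per-sequence anomaly flags, the session length, and the frame size below are hypothetical values. The local-frame measure is included only for contrast with the session-wide percentage and is not attributed to the UNM system.

```python
def overall_rate(flags):
    """Fraction of all captured sequences in the session flagged as anomalous."""
    return sum(flags) / len(flags)


def peak_local_rate(flags, frame):
    """Highest anomaly fraction observed in any run of `frame` sequences."""
    best = 0.0
    for start in range(len(flags) - frame + 1):
        best = max(best, sum(flags[start:start + frame]) / frame)
    return best


# Hypothetical session of 1000 short sequences: 20 consecutive anomalies
# (a concentrated burst) surrounded by matching sequences.
flags = [0] * 490 + [1] * 20 + [0] * 490

session_rate = overall_rate(flags)
burst_rate = peak_local_rate(flags, frame=20)
print(session_rate, burst_rate)
```

Measured over the whole session, the burst amounts to only 2 percent of the captured sequences, although every sequence within the burst itself is anomalous.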
Iowa State University
A group from Iowa State University (“ISU”) has implemented an application-based intrusion detection system that analyzes system calls using state machine models of application behavior, as described in R. Sekar, Y. Cai, and M. Segal, “A Specification-Based Approach for Building Survivable Systems,” in Proceedings of the 21st National Information Systems Security Conference (NISSC '98), Oct. 5–8, 1998, pp. 338–347. However, their approach is concerned less with detecting anomalies than with detecting violations of specified behavior. As a result, the approach of the ISU group requires the development of specification models for acceptable program behavior. Unfortunately, deriving specification models by hand can be quite a difficult process, and it does not scale to the number of programs that need to be specified.
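By way of illustration, a specification of acceptable behavior can be sketched as a hand-written state transition table against which observed system calls are checked. The states, the permitted calls, and the function name below are hypothetical and do not reproduce the specification language used by the ISU group.

```python
# Hypothetical hand-written specification for one illustrative program:
# (current_state, system_call) -> next_state. Any pair not listed is a violation.
SPEC = {
    ("start", "open"): "opened",
    ("opened", "read"): "opened",
    ("opened", "close"): "start",
}


def first_violation(calls, spec=SPEC, state="start"):
    """Return (index, state, call) for the first disallowed call, or None."""
    for index, call in enumerate(calls):
        next_state = spec.get((state, call))
        if next_state is None:
            return index, state, call
        state = next_state
    return None


conforming = first_violation(["open", "read", "read", "close"])
violating = first_violation(["open", "read", "exec", "close"])
print(conforming, violating)
```

Even this three-transition table had to be written by hand for a single program, which suggests why the text above regards manual derivation of specifications as difficult to scale.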
As discussed previously, prior IDSs have employed neural networks to build user profiles based on sequences of commands entered by users. These user-based IDSs were implemented using feed-forward multi-layer perceptron networks, also known as “backpropagation neural networks” or simply “backprops.” The backprop is trained to recognize patterns of normal user behavior. After training, the backprop is used to identify intruders based on deviations from the established user patterns. However, because of the complexities in establishing adequate learning criteria and other problems discussed herein, neural networks have heretofore not been implemented in application-based intrusion detection systems.