An intrusion into a system, such as an information system, can be defined as one or more unauthorized activities that violate the security policy applicable to the system. Intrusion detection is the act of tracing those unauthorized activities (or users) in the system. It relies on the assumption that an intruder's behavior will differ noticeably from that of a legitimate user and that unauthorized actions are therefore detectable. Intrusion detection should thus provide defense in depth against intrusion by checking and rechecking the effectiveness of the system's other access control mechanisms.
The main goal of intrusion detection is to monitor the events occurring in a host machine or network for signs of intrusion and to report those signs to a system administrator, who can then take appropriate remedial and/or preventative actions.
Generally, the detection of intrusions can be classified into two categories, misuse detection and anomaly detection, depending on how the monitored data is evaluated. In misuse detection, information about previous attacks is used to generate attack signatures that can be compared to current activity data in order to determine if the current activity data indicates an intrusion. In anomaly detection, the normal behavior of the system is learned, and any activity that strongly deviates from the learned normal behavioral profile is considered an intrusion.
One of the problems with anomaly intrusion detection is that it is difficult to learn intrusion behavior from discrete data. Unfortunately, the success of intrusion detection depends mainly on how efficiently the audited data can be analyzed for traces of intrusion.
An instance-based learning model can be used to classify query data (i.e., a query instance) according to the relationship between the query instance and stored exemplar instances. To classify the query instance, instance-based learning requires a measure of the similarity between two discrete data sequences.
The similarity measure proposed by Lane and Brodley in “Temporal Sequence Learning and Data Reduction for Anomaly Detection,” Proceedings of the 5th Conference on Computer and Communication Security, ACM Press, New York, N.Y., is a useful similarity metric. According to this similarity metric, the similarity between two discrete valued sequences X and Y of fixed length n, defined as $X = (x_0, x_1, \ldots, x_{n-1})$ and $Y = (y_0, y_1, \ldots, y_{n-1})$, is given by the following pair of functions:
$$
W(X, Y, k) =
\begin{cases}
0 & \text{if } k < 0 \text{ or } x_k \neq y_k \\
1 + W(X, Y, k-1) & \text{if } x_k = y_k
\end{cases}
$$

and

$$
\mathrm{Sim}(X, Y) = \sum_{k=0}^{n-1} W(X, Y, k)
$$
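The pair of functions can be illustrated with a minimal Python sketch. The function names (`run_weight`, `similarity`, `similarity_iterative`) and the sample command sequences are illustrative assumptions, not taken from the cited paper. The sketch follows the recursive definition directly and also gives an equivalent single-pass form.

```python
from typing import Hashable, Sequence


def run_weight(x: Sequence[Hashable], y: Sequence[Hashable], k: int) -> int:
    """W(X, Y, k): length of the run of consecutive matches ending at position k."""
    if k < 0 or x[k] != y[k]:
        return 0
    return 1 + run_weight(x, y, k - 1)


def similarity(x: Sequence[Hashable], y: Sequence[Hashable]) -> int:
    """Sim(X, Y): sum of W(X, Y, k) over k = 0 .. n-1 for equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same fixed length n")
    return sum(run_weight(x, y, k) for k in range(len(x)))


def similarity_iterative(x: Sequence[Hashable], y: Sequence[Hashable]) -> int:
    """Same score in one pass: keep the current run length and accumulate it."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same fixed length n")
    total = 0
    run = 0
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0  # a mismatch resets the run to zero
        total += run
    return total


# Hypothetical command sequences of length n = 4 with one mismatch at position 2.
a = ["ls", "cd", "cat", "vi"]
b = ["ls", "cd", "rm", "vi"]
print(similarity(a, b))  # W values are 1, 2, 0, 1, so the score is 4
print(similarity(a, a))  # W values are 1, 2, 3, 4, so the score is 10
```

Because each match contributes the length of the run it extends, adjacent matches score higher than the same number of scattered matches.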
As can be seen from the above functions, the similarity score between two instances X and Y that are exactly the same is a maximum and has a value of $n(n+1)/2$. This maximum similarity score is denoted $\mathrm{Sim}_{\max}$. A lower bound on the similarity score when there is exactly one unmatched position between any pair of instances X and Y is given by the following function:
$$
Lb_n^1 =
\begin{cases}
\left( \left\lceil \dfrac{n-1}{2} \right\rceil \right)^2 & \text{if } n \text{ is even} \\[2ex]
\dfrac{n^2 - 1}{4} & \text{if } n \text{ is odd}
\end{cases}
$$
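The lower bound can be checked by brute force for small n. The sketch below assumes the similarity score defined earlier and places the single mismatch at every possible position. The helper names (`lower_bound_one_mismatch`, `min_similarity_one_mismatch`) are hypothetical. The minimum occurs when the mismatch falls in the middle, which splits the sequence into two matching runs of nearly equal length.

```python
import math


def similarity(x, y):
    """Run-length-weighted similarity: each match adds the length of its run."""
    total = 0
    run = 0
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0
        total += run
    return total


def lower_bound_one_mismatch(n: int) -> int:
    """Closed-form lower bound on the score with exactly one unmatched position."""
    if n % 2 == 0:
        return math.ceil((n - 1) / 2) ** 2
    return (n * n - 1) // 4


def min_similarity_one_mismatch(n: int) -> int:
    """Brute force: try the single mismatch at each position j and take the minimum."""
    base = [0] * n
    scores = []
    for j in range(n):
        other = list(base)
        other[j] = 1  # exactly one position differs
        scores.append(similarity(base, other))
    return min(scores)


# The closed form and the brute-force minimum agree for every tested length.
for n in range(1, 13):
    assert lower_bound_one_mismatch(n) == min_similarity_one_mismatch(n)

print(lower_bound_one_mismatch(4))  # 4: runs of length 2 and 1 give 3 + 1
print(lower_bound_one_mismatch(5))  # 6: two runs of length 2 give 3 + 3
```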
The converse measurement, i.e., distance, between the sequences X and Y is given by $\mathrm{Dist}(X, Y) = \mathrm{Sim}_{\max} - \mathrm{Sim}(X, Y)$.
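As a rough illustration of how this distance could drive instance-based classification, the sketch below compares a query instance with every stored exemplar and flags it when even the nearest exemplar is too far away. The threshold rule, the threshold value, the helper names, and the sample profile are all assumptions made for illustration. They are not a method described in the source.

```python
from typing import Hashable, Iterable, Sequence

Instance = Sequence[Hashable]


def similarity(x: Instance, y: Instance) -> int:
    """Run-length-weighted similarity between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same fixed length n")
    total = 0
    run = 0
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0
        total += run
    return total


def sim_max(n: int) -> int:
    """Maximum score n(n+1)/2, reached when the two sequences are identical."""
    return n * (n + 1) // 2


def distance(x: Instance, y: Instance) -> int:
    """Dist(X, Y) = Sim_max - Sim(X, Y)."""
    return sim_max(len(x)) - similarity(x, y)


def nearest_distance(query: Instance, profile: Iterable[Instance]) -> int:
    """Distance from the query to its closest stored exemplar."""
    return min(distance(query, exemplar) for exemplar in profile)


def is_anomalous(query: Instance, profile: Iterable[Instance], threshold: int) -> bool:
    """Hypothetical rule: anomalous if the nearest exemplar is farther than threshold."""
    return nearest_distance(query, profile) > threshold


# Hypothetical profile of normal behavior, as command sequences of length 4.
profile = [("ls", "cd", "cat", "vi"), ("cd", "ls", "vi", "make")]

print(nearest_distance(("ls", "cd", "rm", "vi"), profile))       # 10 - 4 = 6
print(is_anomalous(("ls", "cd", "rm", "vi"), profile, 7))        # False
print(is_anomalous(("nc", "wget", "chmod", "sh"), profile, 7))   # True, distance is 10
```

Each query is compared against every exemplar, so classification time grows with the number of stored instances.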
In the context of anomaly detection, user behavior or system behavior is profiled. However, these behavioral profiles can potentially grow without bound. Data reduction is therefore important because the size of the profile directly affects the time required to classify a test instance as normal or anomalous. For real-time detection of intrusive activities to be possible, the behavioral profile of the user/network must be present in main memory. Accordingly, a major challenge in designing an intrusion detection system is to ensure that these behavioral profiles do not consume excessive amounts of primary memory; otherwise, normal activities of the user/network will be impaired.
The present invention is directed to an intrusion detection system that detects anomalies and that addresses one or more of these or other problems.