This invention relates to a probabilistic distribution estimation apparatus, an abnormal behavior detection apparatus, a probabilistic distribution estimation method, and an abnormal behavior detection method and, in particular, to a probabilistic distribution estimation apparatus and an abnormal behavior detection apparatus for detecting abnormal behavior that deviates largely from overall behavior patterns, and to a probabilistic distribution estimation method and an abnormal behavior detection method therefor.
In the prior art, several abnormal behavior detection apparatuses have been proposed in the fields of statistics, data mining, masquerade (disguise) detection, intrusion detection, and the like.
First, an apparatus for detecting abnormality in multidimensional data point by point is disclosed in UK Patent Application No. GB 2361336 A under the title of “Degree of outlier calculation device, and probability density estimation device and histogram calculation device for use therein.” According to GB 2361336 A, the apparatus represents multidimensional data having discrete or continuous values, point by point, using a histogram or a probability density function in order to detect statistical outliers.
Several other abnormal behavior detection apparatuses using behavior data represented as vectors of discrete values have been proposed in the fields of masquerade detection, intrusion detection, and the like, as follows.
Intrusion detection methods using system call data are described by S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff in Proceedings of the 1996 IEEE Symposium on Security and Privacy, pages 120-128, 1996, under the title of “A sense of self for UNIX Processes,” and by C. Warrender, S. Forrest, and B. Pearlmutter in Proceedings of the 1999 IEEE Symposium on Security and Privacy, pages 133-145, 1999, under the title of “Detecting Intrusions Using System Calls: Alternative Data Models.” The method according to S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff comprises the steps of storing partial strings of the system call patterns that a particular program uses internally during normal operation, and of matching the string of system calls of a running program against the stored partial strings to determine whether or not the program is normal. In addition, the method according to C. Warrender, S. Forrest, and B. Pearlmutter comprises the steps of learning a string of past system calls using a hidden Markov model (HMM) and of determining whether or not a running program is normal.
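The matching approach described above can be illustrated with a minimal sketch. It assumes a fixed window length and a mismatch-rate score; the function names, the window length of 3, and the example system call names are hypothetical and are not taken from the cited paper.

```python
from typing import List, Set, Tuple

def build_normal_profile(trace: List[str], window: int = 3) -> Set[Tuple[str, ...]]:
    """Store every length-`window` partial string seen in a normal trace."""
    return {tuple(trace[i:i + window]) for i in range(len(trace) - window + 1)}

def mismatch_rate(trace: List[str], profile: Set[Tuple[str, ...]], window: int = 3) -> float:
    """Fraction of windows in `trace` that do not appear in the stored profile."""
    total = len(trace) - window + 1
    if total <= 0:
        return 0.0  # trace too short to form a single window
    misses = sum(
        1 for i in range(total) if tuple(trace[i:i + window]) not in profile
    )
    return misses / total

# Hypothetical traces: one recorded during normal operation, one to be checked.
normal = ["open", "read", "mmap", "mmap", "open", "read", "close"]
profile = build_normal_profile(normal)
suspicious = ["open", "read", "exec", "write", "close"]
rate = mismatch_rate(suspicious, profile)
```

Because every window is stored and compared exactly, the profile grows with the variety of normal behavior, and a single unseen system call invalidates every window that contains it.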
Furthermore, a masquerade detection method is described by R. A. Maxion and T. N. Townsend in Proceedings of the International Conference on Dependable Systems & Networks, pages 219-228, 2002, under the title of “Masquerade Detection Using Truncated Command Lines.” This method comprises the steps of learning the past command history of a specific user using a Naive Bayes model and of determining whether or not the current behavior of the user is normal using the obtained parameters.
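A Naive Bayes scoring of this kind might be sketched as follows. This is an illustration under stated assumptions, not the procedure of the cited paper: the smoothing constant `alpha`, the vocabulary size, the per-command averaging, and all names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List

def train(commands: List[str], vocab_size: int, alpha: float = 0.01) -> Callable[[str], float]:
    """Estimate smoothed per-command log-probabilities from a user's history."""
    counts = Counter(commands)
    total = len(commands)

    def log_prob(cmd: str) -> float:
        # Additive smoothing keeps unseen commands at a small nonzero probability.
        return math.log((counts[cmd] + alpha) / (total + alpha * vocab_size))

    return log_prob

def score_block(block: List[str], log_prob: Callable[[str], float]) -> float:
    """Average log-probability of a block of commands; lower means less typical."""
    return sum(log_prob(c) for c in block) / len(block)

# Hypothetical command history of one user.
history = ["ls"] * 50 + ["cd"] * 30 + ["vi"] * 20
log_prob = train(history, vocab_size=100)
typical = score_block(["ls", "cd", "vi"], log_prob)
unusual = score_block(["nc", "wget", "chmod"], log_prob)
```

A block would be flagged when its score falls below a threshold chosen on held-out data for the same user.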
An abnormal behavior detection method using Web access logs is described by I. V. Cadez and P. S. Bradley in Proceedings of the Neural Information Processing Systems, pages 1345-1352, 2001, under the title of “Model Based Population Tracking and Automatic Detection of Distribution Changes.” This method detects a variation of whole behavior using access log data of a plurality of users.
In addition, a system for detecting abnormal human behavior from video camera images is disclosed in U.S. Pat. No. 6,212,510 issued to Matthew E. Brand. This system estimates a behavior model using an entropic prior and a hidden Markov model.
On the other hand, abnormal behavior detection apparatuses using behavior data represented by continuous vector data are as follows.
A method for detecting change-points in time series data is described by K. Yamanishi and J. Takeuchi in Proceedings of KDD2002, pages 41-46, 2002, under the title of “A unifying Framework for Detecting outliers and change-points from non-stationary time series data.” This method comprises the steps of learning time series data online using an autoregression model or the like and of detecting, as change-points, points where the model changes largely.
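The idea of learning a time series online while discounting older data can be sketched with a first-order autoregression model. This is a deliberately simplified illustration, not the algorithm of the cited paper: the class name, the discounting rate `r`, the initial values, and the use of the logarithmic loss as a score are all assumptions made for the example.

```python
import math

class DiscountedAR1:
    """Online AR(1) model whose statistics forget old data at rate r."""

    def __init__(self, r: float = 0.05):
        self.r = r
        self.mu = 0.0      # discounted mean
        self.c0 = 1.0      # discounted variance of the series
        self.c1 = 0.0      # discounted lag-1 covariance
        self.sigma2 = 1.0  # discounted variance of the prediction error
        self.prev = None   # previous observation

    def update(self, x: float) -> float:
        """Score x by its log-loss under the current model, then update the model."""
        if self.prev is None:
            self.mu = x
            self.prev = x
            return 0.0
        a = self.c1 / self.c0 if self.c0 > 1e-12 else 0.0
        err = x - (self.mu + a * (self.prev - self.mu))
        score = 0.5 * math.log(2 * math.pi * self.sigma2) + err * err / (2 * self.sigma2)
        r = self.r
        # Exponentially discounted updates: recent points weigh more than old ones.
        self.mu = (1 - r) * self.mu + r * x
        self.c0 = (1 - r) * self.c0 + r * (x - self.mu) ** 2
        self.c1 = (1 - r) * self.c1 + r * (x - self.mu) * (self.prev - self.mu)
        self.sigma2 = max((1 - r) * self.sigma2 + r * err * err, 1e-6)
        self.prev = x
        return score

# Hypothetical series: a small oscillation followed by an abrupt jump.
series = [0.1, -0.1] * 50 + [10.0]
model = DiscountedAR1(r=0.05)
scores = [model.update(x) for x in series]
```

Points with a large score are poorly predicted by the model learned so far and are candidates for outliers or change-points; a larger `r` makes the model forget faster and adapt sooner.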
A method of finding a characteristic point in continuous time series data is described by X. Ge and P. Smyth in Proceedings of KDD2000, pages 81-90, 2000, under the title of “Deformable Markov Model Templates for Time-Series Pattern Matching.” This method comprises the steps of representing continuous time series data using a distribution model of a continuous time and a hidden Markov model having a regression model corresponding to each state and of detecting, as a characteristic point, the continuous time series data corresponding to a particular state.
In addition, a system for carrying out state estimation of trajectory data (continuous behavior data) is described by S. Gaffney and P. Smyth in Proceedings of KDD1999, pages 63-72, 1999, under the title of “Trajectory Clustering with Mixtures of Regression Models.” This system comprises state estimation means which learns trajectory data using a finite mixture distribution of regression models and calculates the certainty with which the trajectory data arise from each regression model in the finite mixture distribution.
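The certainty that a trajectory arises from each regression model can be sketched as a posterior membership probability. The example assumes linear regression components with Gaussian noise and already-estimated parameters; the parameter layout and names are hypothetical, and the learning step itself is not shown.

```python
import math
from typing import List, Tuple

# Each component is (mixing weight, slope, intercept, noise standard deviation).
Component = Tuple[float, float, float, float]

def responsibilities(ts: List[float], ys: List[float],
                     components: List[Component]) -> List[float]:
    """Posterior probability that the trajectory (ts, ys) came from each component."""
    log_terms = []
    for weight, slope, intercept, sigma in components:
        ll = math.log(weight)
        for t, y in zip(ts, ys):
            resid = y - (slope * t + intercept)
            ll += -0.5 * math.log(2 * math.pi * sigma * sigma) - resid * resid / (2 * sigma * sigma)
        log_terms.append(ll)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    m = max(log_terms)
    exps = [math.exp(l - m) for l in log_terms]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical trajectory and a two-component mixture (rising and falling lines).
ts = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.0, 2.1, 2.9]
mixture = [(0.5, 1.0, 0.0, 0.5), (0.5, -1.0, 3.0, 0.5)]
resp = responsibilities(ts, ys, mixture)
```

The resulting values sum to one and indicate which component best explains the trajectory.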
However, the above-mentioned prior art has the following problems.
A first problem is a lack of adaptability to variation in the information source that generates the data. This is because all of the methods except for that of UK Patent Application No. GB 2361336 A and that of K. Yamanishi and J. Takeuchi treat all past data equally and therefore cannot cope with a change in the pattern.
A second problem is insufficient scalability. Because the method according to S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff uses matching, it requires a large memory capacity to carry out detection at a high precision. Moreover, all of the methods except for that of UK Patent Application No. GB 2361336 A and that of K. Yamanishi and J. Takeuchi are computationally inefficient and also require a large memory capacity because their learning algorithms use all past data.
A third problem is a lack of robustness against noise. This is because the method according to S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, which relies on matching, determines as abnormal even a string that differs only slightly from the stored partial strings.
A fourth problem is that the kinds of abnormal behavior that can be detected are restricted. The method according to S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, the method according to C. Warrender, S. Forrest, and B. Pearlmutter, and the method according to R. A. Maxion and T. N. Townsend, all of which deal with discrete data, are each specialized to a particular problem. Although they can detect abnormal behavior in the sense of an outlier that deviates largely from the past behavior of a single program or a single user, they cannot deal with problems such as the occurrence of bursts of abnormal behavior, a plurality of programs, or a plurality of users. Similarly, the system according to U.S. Pat. No. 6,212,510 can only detect behavior in the sense of an outlier from a learned model. In the problem of analyzing the access logs of a plurality of users, the method according to I. V. Cadez and P. S. Bradley can detect a variation of whole behavior but cannot detect a variation of individual behavior. The method according to X. Ge and P. Smyth, which deals with continuous data, can detect a characteristic point in continuous time series data when it is known in advance what should be paid attention to, but cannot detect an abnormal trajectory. The method according to S. Gaffney and P. Smyth comprises trajectory state estimation means but likewise cannot detect an abnormal trajectory.
A fifth problem is that detection precision is poor when little data is available. The method according to S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, the method according to C. Warrender, S. Forrest, and B. Pearlmutter, and the method according to R. A. Maxion and T. N. Townsend cannot detect abnormal behavior of the single program or the single user at a high precision when a sufficient amount of past data is not available.
A sixth problem is that the data that can be analyzed are restricted. The system according to UK Patent Application No. GB 2361336 A can detect, point by point, discrete or continuous data that are outliers from the learned model, but cannot detect abnormal behavior. Likewise, the system according to K. Yamanishi and J. Takeuchi can detect outliers or change-points in discrete or continuous data point by point, but cannot detect abnormality in a pattern of behavior data.