1. Field of the Invention
The present invention relates to a degree of outlier calculation device, and to a probability density estimation device and a histogram calculation device for use therein, and more particularly to statistical outlier detection and fraud detection techniques for detecting an abnormal value or an outlier that deviates largely from the data patterns obtained so far in multi-dimensional time series data.
2. Description of the Related Art
Such a degree of outlier calculation device is used to find an abnormal value or an outlier that deviates largely from the data patterns obtained so far in multi-dimensional time series data. It is employed, for example, to find fraudulent behavior such as so-called cloning use in records of cellular phone services, or to find abnormal transactions in the usage history of a credit card.
Well-known conventional fraud detection methods using a machine learning technique include the method by T. Fawcett and F. Provost (“Combining Data Mining and Machine Learning for Effective Fraud Detection, Proceedings of AI Approaches to Fraud Detection and Risk Management, pp. 14-19, 1997”) and the method by J. Ryan, M. Lin and R. Miikkulainen (“Intrusion Detection with Neural Networks, Proceedings of AI Approaches to Fraud Detection and Risk Management, pp. 72-77, 1997”).
Among the above methods, one that makes use of an idea of statistical outlier detection, in particular, is the method by P. Burge and J. Shawe-Taylor (“Detecting Cellular Fraud Using Adaptive Prototypes, Proceedings of AI Approaches to Fraud Detection and Risk Management, pp. 9-13, 1997”).
As a learning algorithm for a parametric finite mixture model, the EM algorithm by A. P. Dempster, N. M. Laird and D. B. Rubin is well known (“Maximum Likelihood from Incomplete Data via the EM Algorithm, Journal of the Royal Statistical Society, B, 39(1), pp. 1-38, 1977”).
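As a rough illustration of the kind of learning the EM algorithm performs, the following sketch fits a one-dimensional mixture of two normal distributions. The function names, the initial parameter values and the synthetic data are assumptions made for illustration only; the sketch is not taken from the cited paper or from the present invention.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Minimal EM for a 1-D normal mixture (illustrative sketch only)."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters; every data point counts equally.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsing component
    return means, variances, weights


# Hypothetical synthetic data: two well-separated clusters.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])
means, variances, weights = em_gmm_1d(data, [-1.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

Each M-step sums over the whole data set with no regard to when a point was observed, which is the sense in which the algorithm treats all past data alike.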
As a learning algorithm for a normal kernel mixture distribution (a mixture of a finite number of the same normal distributions), the prototype updating algorithm by I. Grabec is known (“Self-Organization of Neurons Described by the Maximum-Entropy Principle, Biological Cybernetics, vol. 63, pp. 403-409, 1990”).
The above-described methods by T. Fawcett and F. Provost and by J. Ryan, M. Lin and R. Miikkulainen relate to fraud detection realized by learning fraud patterns from data already known to be fraudulent or not (so-called supervised data). In practice, however, it is difficult to obtain a sufficient amount of fraud data, so that highly precise learning cannot be conducted and fraud detection precision decreases.
The method by P. Burge and J. Shawe-Taylor relates to similar fraud detection based on unsupervised data. This method, however, conducts fraud detection with two non-parametric models, a short-term model and a long-term model, and uses the distance between them as the criterion for an outlier. Because the statistical basis of the short-term model and the long-term model is insufficient, the statistical significance of the distance between them is unclear.
In addition, preparing two models, a short-term model and a long-term model, lowers calculation efficiency. Further problems are that only continuous value data can be handled, not categorical data, and that, since only non-parametric models are handled, fraud detection is unstable and inefficient.
Although the EM algorithm by A. P. Dempster, N. M. Laird and D. B. Rubin and the prototype updating algorithm by I. Grabec are known as learning algorithms for a statistical model, these algorithms learn from all the past data with equal weight and therefore fail to cope with a change in pattern.
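The effect of equal weighting can be seen in a small sketch. Below, a running mean that weights every past observation equally is compared with an estimate that gradually discounts older observations. The discounting rule, the parameter r and the synthetic stream are assumptions chosen for illustration; they are not a method described in this section.

```python
def equal_weight_mean(stream):
    """Running mean in which every past observation has the same weight."""
    total = 0.0
    estimates = []
    for t, x in enumerate(stream, start=1):
        total += x
        estimates.append(total / t)
    return estimates


def discounted_mean(stream, r=0.05):
    """Running estimate that down-weights older observations.

    r is a hypothetical discounting rate between 0 and 1; a larger r
    forgets the past more quickly.
    """
    estimate = None
    estimates = []
    for x in stream:
        estimate = x if estimate is None else (1.0 - r) * estimate + r * x
        estimates.append(estimate)
    return estimates


# Hypothetical stream whose pattern changes from 0 to 10 after 500 steps.
stream = [0.0] * 500 + [10.0] * 100
equal = equal_weight_mean(stream)
disc = discounted_mean(stream)
```

After 100 observations at the new level, the equally weighted mean is still only 10/6 (about 1.67), while the discounted estimate has moved to above 9.9. An estimator of the first kind therefore keeps describing the old pattern long after the data have changed.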