The present invention relates to a method and apparatus for deriving a statistical measure of variation from a decaying mean, and in particular to their use in telecommunications and anomaly detection applications and a system incorporating the same.
In recent years there has been a rapid increase in the number of commercially operated telecommunications networks in general, and of wireless telecommunications networks in particular. Associated with this proliferation of networks is a rise in fraudulent use of such networks, the fraud typically taking the form of gaining illicit access to the network and then using the network in such a way that the fraudulent user hopes subsequently to avoid paying for the resources used. This may, for example, involve misuse of a third party's account on the network, so that the perpetrated fraud becomes apparent only when the third party is charged for resources which he did not use.
Since fraudulent use of a single account can cost a network operator a large sum of money within a short space of time, it is important that the operator be able to identify and deal with the most costly forms of fraud at the earliest possible time.
One of the steps employed in such fraud detection systems, although not limited to use in them, is anomaly detection from event streams.
Pattern recognition for event streams can be achieved by building up profiles of the behaviour of an entity and performing anomaly detection over these profiles. Such profiles may contain statistical information including, but not restricted to, an average of event values (for example the mean) and a measure of the statistical variation from that average (for example the variance or standard deviation). It is then possible to compare a newly received event value with the average and a measure of the typical variation from that average, and to decide on that basis whether the newly received event value is or is not anomalous. For example, in a telephone network the event data may relate to the number of minutes of telephone calls made in a given period (for example one day). In the case of a domestic subscriber who typically makes an average of 10 minutes of calls per day, sudden call record data of 300 or 400 minutes in one day may be indicative not only of an anomaly but of fraudulent use of the telephone system. In such a case it may be appropriate to raise an alarm only where the recently received value lies more than some multiple of standard deviations from the mean.
In practice it is not feasible to retain all historic values of events. However, given a situation where the mean, $\mu_{n-1}$, of $n-1$ values is known, this measure can be updated given an additional data value, $v_n$, to give a new mean, $\mu_n$, as defined in equation [1]. This provides an exact value for the new mean.
$$\mu_n = \left(1 - \frac{1}{n}\right)\mu_{n-1} + \frac{1}{n}\,v_n \qquad [1]$$
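By way of illustration only, the running mean update may be sketched in Python as follows. The sketch assumes the standard incremental form, in which the previous mean is weighted by (1 - 1/n) and the new value by 1/n. The function name `update_mean` and the sample values are hypothetical and are not taken from the text above.

```python
def update_mean(prev_mean: float, n: int, value: float) -> float:
    """Return the mean of n values, given the mean of the first n-1
    values and the n-th value, without access to the earlier values."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1 - 1 / n) * prev_mean + (1 / n) * value


# Hypothetical daily call minutes for one subscriber.
values = [10.0, 12.0, 8.0, 300.0]

mean = 0.0  # ignored at n = 1, where the weight on prev_mean is zero
for n, v in enumerate(values, start=1):
    mean = update_mean(mean, n, v)
```

Only the current mean and the count n need be stored between updates.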
The true statistical variance of the data cannot, however, be calculated exactly if the previous values are not retained. This is because all previous deviations from the mean must be recalculated when the mean changes, and this cannot be done if previous values have not been retained. However, it is possible to derive approximations to, and estimates of, the variance. A first approximation to the variance ($S$) can be made by updating it in a manner analogous to the mean update equation [1]. This method simply ignores the strict necessity to recompute all deviations and treats the previous deviation measure as though it were a mean deviation. It can then be updated using equation [2].
$$S_n = \left(1 - \frac{1}{n}\right)S_{n-1} + \frac{1}{n}\left(v_n - \mu_{n-1}\right)^2 \qquad [2]$$
An alternative approximation which includes a correction for the recalculation of the previous variance is defined in equation [3]. This is a known equation for variance estimation that is used for time series data. This provides a closer approximation to the true variance in the case where n is known.
$$S_n = \left(1 - \frac{1}{n}\right)S_{n-1} + \left(\frac{1}{n} - \frac{1}{n^2}\right)\left(v_n - \mu_{n-1}\right)^2 \qquad [3]$$
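The two variance updates may be compared on a small sample, as in the following Python sketch. It is an illustration under stated assumptions: the weight on the squared deviation in equation [3] is taken to be 1/n - 1/n^2, the running state is seeded from the first value, and the function names and sample values are hypothetical.

```python
from statistics import pvariance


def update_variance_simple(prev_s: float, prev_mean: float, n: int, value: float) -> float:
    """Equation [2]: treat the previous measure as a plain running average."""
    return (1 - 1 / n) * prev_s + (1 / n) * (value - prev_mean) ** 2


def update_variance_corrected(prev_s: float, prev_mean: float, n: int, value: float) -> float:
    """Equation [3], assuming the correction weight is 1/n - 1/n**2."""
    return (1 - 1 / n) * prev_s + (1 / n - 1 / n ** 2) * (value - prev_mean) ** 2


values = [10.0, 12.0, 8.0, 11.0, 9.0]  # hypothetical event values

# Seed from the first value: one value has mean v_1 and zero variance.
mean = values[0]
s_simple = 0.0
s_corrected = 0.0
for n, v in enumerate(values[1:], start=2):
    # Both updates use the mean of the previous n-1 values.
    s_simple = update_variance_simple(s_simple, mean, n, v)
    s_corrected = update_variance_corrected(s_corrected, mean, n, v)
    mean = (1 - 1 / n) * mean + (1 / n) * v

# Reference value computed from all retained values, for comparison only.
reference = pvariance(values)
```

On this sample the corrected update agrees with the population variance computed from the retained values, while the simple update gives a larger figure.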
The field of variance estimation has been extensively studied. The technique of Kalman Filtering, widely used in the analysis of time series data, employs a similar method for co-variance estimation.
A disadvantage of using the conventional statistical averages and measures of variance such as mean and standard deviation is that all input data values have equal influence on the resulting measures. In situations where the event data may be locally stable but vary significantly over longer time scales (e.g. telephone account usage patterns), it is undesirable that older data values relating to prior (pseudo-)stable states should retain equal influence in measures to be applied to the current (pseudo-)stable state.
This can be dealt with for conventional statistical calculations by selecting a time period and calculating the mean and variance over the period specified. This period can then serve as a moving window for the calculation of statistical measures. However, this method requires that all data values within the window be stored for accurate updating, and that a window of appropriate size can be determined. In order to provide a measure of variation that is usable for large multi-dimensional datasets, an appropriate method of variance estimation based on the update formulae described must be found.
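The storage cost of the moving-window approach can be seen in a minimal Python sketch. The class name `WindowStats` and the window size are hypothetical; the sketch is intended only to show that every value in the window must be retained.

```python
from collections import deque
from statistics import fmean, pvariance


class WindowStats:
    """Mean and variance over the most recent `size` values.

    Every value in the window is stored, which is the cost that an
    update-formula approach seeks to avoid.
    """

    def __init__(self, size: int) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        self.values = deque(maxlen=size)  # oldest value is dropped automatically

    def add(self, value: float) -> None:
        self.values.append(value)

    def mean(self) -> float:
        return fmean(self.values)

    def variance(self) -> float:
        return pvariance(self.values)


window = WindowStats(size=3)
for v in [1.0, 2.0, 3.0, 4.0]:
    window.add(v)
# The window now holds 2.0, 3.0 and 4.0 only.
```

For a profile kept per subscriber and per event type, this storage is multiplied by the window length, which motivates an update that keeps only a fixed number of stored quantities.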
The invention seeks to provide an improved method and apparatus for deriving a statistical measure of variation from a decaying mean.
The invention also seeks to provide an improved method and apparatus for anomaly detection in data streams in general, and for anomaly detection in data streams relating to telecommunications account data in particular.
The invention provides an application of an adaptation of the calculation of standard deviation outlined below. It results in a specific mathematical formula for maintaining a sequential deviation measure. The method extends to allow the deviation itself to be decayed where no events of a given type occur in the event stream. This is equivalent to zero-value events occurring, and a formula that provides an approximate calculation for this case is also provided.
According to a first aspect of the present invention there is provided a method of detecting anomalies in a stream of data values comprising the steps of: receiving a data value on said stream of data; calculating a new weighted average responsive to said data value, a previously stored weighted average associated with said stream of data, and a decay rate in the range of 0 to 1; calculating a new measure of deviation from said new weighted average responsive to said new weighted average, said data value, a previously stored measure of deviation associated with said stream of data, and said decay rate; and storing said new weighted average and said new measure of deviation.
In one preferred embodiment the method additionally comprises the steps of: determining an anomaly threshold responsive to said previously stored weighted average and a previously stored measure of deviation; and deciding whether said data value is anomalous responsive to a comparison between said data value and said anomaly threshold.
Preferably, said anomaly threshold is a sum of said previously stored weighted average and a multiple of said previously stored measure of deviation therefrom.
Preferably, said multiple is in the range 2 to 10.
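The threshold test may be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the default multiple of 3 is an arbitrary choice within the stated range, and the sketch assumes that a value is treated as anomalous when it exceeds the threshold.

```python
def anomaly_threshold(stored_average: float, stored_deviation: float,
                      multiple: float = 3.0) -> float:
    """Sum of the stored weighted average and a multiple of the stored deviation."""
    return stored_average + multiple * stored_deviation


def is_anomalous(value: float, stored_average: float, stored_deviation: float,
                 multiple: float = 3.0) -> bool:
    """Assumption: a value above the threshold is flagged as anomalous."""
    return value > anomaly_threshold(stored_average, stored_deviation, multiple)


# Hypothetical profile: 10 minutes per day on average, deviation measure of 5.
threshold = anomaly_threshold(10.0, 5.0, multiple=3.0)
flag_high = is_anomalous(300.0, 10.0, 5.0)
flag_normal = is_anomalous(12.0, 10.0, 5.0)
```

The comparison uses the previously stored profile, so a single extreme value is tested before it can influence the average or the deviation.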
Preferably, said new weighted average is a sum of a product of said decay rate and said previously stored weighted average and a product of one minus said decay rate and said data value.
In a preferred embodiment, said new weighted average is $d \cdot v + (1 - d) \cdot h$, wherein $d$ is said decay rate, $v$ is said data value, and $h$ is said previously stored weighted average.
In one preferred embodiment, said decay rate has a half-life and said measure of deviation is calculated responsive to an approximation to said half-life.
Preferably, said half-life is determined by $(1 - d)^{\lambda} = 0.5$, wherein $d$ is said decay rate.
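Solving the half-life relation for the half-life gives log(0.5) / log(1 - d). A short Python sketch of this calculation follows; the function name `half_life` and the example decay rate of 0.05 are hypothetical.

```python
import math


def half_life(decay_rate: float) -> float:
    """Number of updates after which a past value's weight (1 - d)**k falls to 0.5."""
    if not 0.0 < decay_rate < 1.0:
        raise ValueError("decay_rate must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(1.0 - decay_rate)


d = 0.05
lam = half_life(d)  # roughly 13.5 updates for d = 0.05
```

A smaller decay rate gives a longer half-life, so older values retain their influence for longer.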
Preferably, said new measure of deviation is
$$DV + \frac{(v - h)^2 - DV}{2 \cdot \lambda}$$
wherein $DV$ is said previously stored measure of deviation, $v$ is said value, $h$ is said new weighted average, and $\lambda$ is said half-life.
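One update of the stored profile may be sketched in Python as follows. The sketch is illustrative and rests on several assumptions: the new weighted average takes the form d·v + (1 - d)·h, the half-life is computed exactly from the decay rate rather than approximated, and the names `Profile` and `update_profile` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Profile:
    average: float    # stored weighted average, h
    deviation: float  # stored measure of deviation, DV


def update_profile(profile: Profile, value: float, decay_rate: float) -> Profile:
    """Return the new profile after receiving one data value."""
    if not 0.0 < decay_rate < 1.0:
        raise ValueError("decay_rate must lie strictly between 0 and 1")
    # Half-life from (1 - d)**lam == 0.5 (computed exactly here).
    lam = math.log(0.5) / math.log(1.0 - decay_rate)
    # New weighted average: d*v + (1 - d)*h.
    new_average = decay_rate * value + (1.0 - decay_rate) * profile.average
    # New deviation: DV + ((v - h_new)**2 - DV) / (2 * lam).
    new_deviation = profile.deviation + (
        (value - new_average) ** 2 - profile.deviation
    ) / (2.0 * lam)
    return Profile(new_average, new_deviation)


stored = Profile(average=10.0, deviation=4.0)     # hypothetical stored profile
typical = update_profile(stored, 10.0, 0.05)      # value equal to the average
unusual = update_profile(stored, 300.0, 0.05)     # value far from the average
```

Only the two stored quantities are carried from one update to the next. A value equal to the average lets the deviation decay, while a distant value raises it.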
In one preferred embodiment said decay rate is less than 0.1.
In a preferred embodiment, said data value relates to subscriber account usage.
In a preferred embodiment, an anomalous data value is indicative of account usage fraud.
In a preferred embodiment, said subscriber account is a telecommunications network subscriber account.
In a preferred embodiment, said telecommunications network is a wireless network.
In one preferred embodiment, successive data values relate to uniform-length time periods.
In a further preferred embodiment, successive data values relate to non-uniform-length time periods.
Preferably, said new weighted average and said new measure of deviation are calculated responsive to a measure of a time period associated with said data value.
Advantageously, the method gives better tracking of slow changes in behaviour over time than do the standard measures of mean and standard deviation.
Advantageously, the method minimises calculation steps involved at each stage and obviates storing all past values for calculating the profile value.
According to a further aspect of the present invention there is provided a system for detecting anomalies in a stream of data values, comprising: a processor arranged to receive a data value from said stream of data values; to calculate a new weighted average responsive to said data value, a previously stored weighted average associated with said stream of data, and a decay rate in the range 0 to 1; and to calculate a new measure of deviation from said new weighted average responsive to said new weighted average, said data value, a previously stored measure of deviation associated with said stream of data, and said decay rate; and a storage device upon which to store said previously stored weighted average and said previously stored measure of deviation.
The present invention also provides for a telecommunications system comprising such a system for anomaly detection. In a particularly appropriate arrangement, the telecommunications system is a wireless telecommunications system.
The invention also provides for a system for the purposes of digital signal processing which comprises one or more instances of apparatus embodying the present invention, together with other additional apparatus.
The invention also provides for a program for a computer on a machine-readable medium arranged to perform the steps of the method in any of its embodiments.
In particular, there is provided a program for a computer on a machine-readable medium arranged to perform the steps of: receiving a data value on said stream of data; calculating a new weighted average responsive to said data value, a previously stored weighted average associated with said stream of data, and a decay rate in the range of 0 to 1; and calculating a new measure of deviation from said new weighted average responsive to said new weighted average, said data value, a previously stored measure of deviation associated with said stream of data, and said decay rate.
The preferred features may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the invention.