There exist numerous applications in which real time data analysis may be required. For example, data events may be collected in a financial setting to identify potentially fraudulent activity, in a network setting to track network usage, in a business setting to identify business opportunities or problems, etc. Often, it may be necessary to examine individual data events as they occur so that any suspect behavior can be investigated immediately. Challenges arise, however, when analyzing data events in real time, since historical data values are typically necessary to identify trends and patterns. Namely, accessing historical data can be a relatively slow process, which limits real time processing.
There exist various known techniques (e.g., running estimates, moving windows, etc.) for analyzing data events in real time (or near real time). Such techniques utilize little or no stored historical data to provide a statistical analysis of detected event values. Instead, they may, for example, maintain a running value that is updated each time a new data event value is collected. The historical data is thus essentially “built in” to the currently calculated estimate, which provides a statistical summary in a single value.
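As an illustration of the general idea, the following is a minimal sketch of one such running estimate, here a running mean. The class name `RunningMean` and the sample values are hypothetical and chosen only for illustration; the text above does not prescribe any particular estimator.

```python
class RunningMean:
    """Incrementally updated mean; keeps only a count and the current estimate."""

    def __init__(self):
        self.count = 0
        self.value = 0.0

    def update(self, x):
        self.count += 1
        # Move the estimate toward the new value by 1/count of the difference,
        # so no previously collected event values need to be stored or re-read.
        self.value += (x - self.value) / self.count
        return self.value


rm = RunningMean()
for event_value in [10.0, 20.0, 30.0, 40.0]:  # made-up event values
    rm.update(event_value)

print(rm.value)  # 25.0
```

Each update touches only the stored count and estimate, which is why techniques of this kind can avoid the relatively slow access to historical data.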
In some applications, such as those analyzed with a running median, comparing new values to the existing running median value may provide only limited information upon which a business analysis may be based. One approach is to calculate a standard deviation to provide more insightful analysis. However, because the overall distribution of the data events might not be Gaussian in nature, the standard deviation may also be subject to limitations.
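One such limitation can be sketched with a small, made-up data set that is strongly right-skewed. The values below are assumptions chosen only to illustrate the point and do not come from any real application. Because the standard deviation is symmetric, a one-deviation band around the median can extend below any value that was actually observed.

```python
import statistics

# Made-up, right-skewed event values: most are small, one is very large.
values = [1, 1, 1, 2, 2, 3, 50]

median = statistics.median(values)
stdev = statistics.pstdev(values)  # population standard deviation

# A symmetric band of one standard deviation around the median.
lower, upper = median - stdev, median + stdev

print(median)               # 2
print(lower < min(values))  # True
```

Here the single large value inflates the standard deviation, so the lower edge of the band falls below the smallest observed value and says little about how unusual a low event actually is.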