Time series data can be generated and analyzed for a number of applications. For example, automatic equipment monitoring can avoid costly repairs. This can be done by analyzing time series data acquired by sensors on or near the equipment to detect anomalies that may indicate that maintenance or repair of the equipment is needed.
As shown in FIGS. 1A and 1B for typical stochastic and multivariate time series data, most prior methods for detecting anomalies simply determine values outside of a normal operating range 101.
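For illustration only, the range-based detection described above can be sketched as follows. The function name, the per-variable bounds, and the sample readings are hypothetical and do not come from any cited method.

```python
from typing import List, Sequence


def out_of_range_indices(
    samples: Sequence[Sequence[float]],
    low: Sequence[float],
    high: Sequence[float],
) -> List[int]:
    """Return the time indices at which any variable leaves its normal range.

    samples: one vector of sensor values per time step (multivariate series).
    low, high: hypothetical per-variable bounds of the normal operating range.
    """
    flagged = []
    for t, sample in enumerate(samples):
        # A time step is flagged if at least one variable is outside its bounds.
        if any(x < lo or x > hi for x, lo, hi in zip(sample, low, high)):
            flagged.append(t)
    return flagged


# Hypothetical two-variable readings; steps 2 and 3 each have one variable out of range.
readings = [(0.1, 5.0), (0.3, 5.2), (2.7, 5.1), (0.2, 9.9), (0.0, 5.0)]
flagged = out_of_range_indices(readings, low=(-1.0, 4.0), high=(1.0, 6.0))
print(flagged)
```

A check of this kind cannot flag a series whose values stay inside the range while following an unusual pattern over time.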
For example, one method assumes that multivariate time series can be modeled locally as a vector autoregressive (AR) model, see Bay et al., “A Framework for Discovering Anomalous Regimes in Multivariate Time-Series Data with Local Models,” Technical Report, Center for the Study of Language and Information, Stanford University, 2004. That method first learns a distribution of AR model parameters for each time window of the training data. During testing, for each time window, the AR model parameters are estimated and the probability of these parameters is determined from the previously learned probability distribution. The distribution learned by that method uses a restrictive autoregressive assumption. That method works well for time series data with random stochastic components.
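The general idea of learning a distribution over windowed AR parameters and scoring test windows against it can be sketched as below. This is a deliberately simplified illustration and not the method of Bay et al.: it assumes a univariate, zero-mean AR(1) model instead of a vector AR model, non-overlapping windows, and a single Gaussian as the learned parameter distribution. All function names, the window size, and the synthetic data are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def ar1_coefficient(window):
    """Least-squares estimate of a in x[t] = a * x[t-1] + noise (zero mean assumed)."""
    num = sum(window[t] * window[t - 1] for t in range(1, len(window)))
    den = sum(window[t - 1] ** 2 for t in range(1, len(window)))
    return num / den if den else 0.0


def split_windows(series, size):
    """Split a series into consecutive non-overlapping windows of the given size."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


def fit_parameter_distribution(train, size):
    """Training: estimate one AR(1) coefficient per window, then fit a Gaussian to them."""
    coeffs = [ar1_coefficient(w) for w in split_windows(train, size)]
    return mean(coeffs), stdev(coeffs)


def log_density(a, mu, sigma):
    """Log of the Gaussian density of coefficient a under the learned distribution."""
    return -0.5 * ((a - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def simulate_ar1(a, n, rng):
    """Generate synthetic AR(1) data with unit-variance Gaussian noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


rng = random.Random(0)
train = simulate_ar1(0.8, 2000, rng)  # synthetic "normal" training data
mu, sigma = fit_parameter_distribution(train, size=100)

# Testing: estimate the coefficient of each test window and score it.
normal_score = log_density(ar1_coefficient(simulate_ar1(0.8, 100, rng)), mu, sigma)
anomalous_score = log_density(ar1_coefficient(simulate_ar1(-0.5, 100, rng)), mu, sigma)
print(normal_score, anomalous_score)
```

In this sketch, a window whose dynamics differ from the training data yields a coefficient that is improbable under the learned distribution, so its log-density is lower. The sketch also makes the restriction visible: a window is only scored through its fitted AR coefficient.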
Most methods focus on a single type of time series data for a particular application. There are very few methods that attempt a more general solution to the problem of anomaly detection in time series data.
U.S. Pat. No. 7,716,011, “Strategies for identifying anomalies in time-series data,” describes a model-based approach that fits spline segments to time-series data, see FIG. 1C. That method computes changes in the spline parameters and uses L1 distances between the splines to detect anomalies, and L2 distances to measure normal operation. That method works well on time series with a smooth trajectory.