Technical Field
The present invention relates to data processing, and more particularly to extracting interpretable features for classification of multivariate time series from physical systems.
Description of the Related Art
Previous approaches to time series classification, clustering, and signature extraction have focused primarily on the univariate case, i.e., the case in which each instance includes a single time series. These approaches can be organized into two themes: global and local. Global techniques consider the entire time series at once and extract either similarity-based features (e.g., the Euclidean distance or the Dynamic Time Warping distance between two time series) or interval-based features (e.g., the mean, variance, minimum, or maximum value over a sliding time window). These features are then used as input to standard classifiers, such as Support Vector Machines (SVMs) and decision trees, or to clustering algorithms, such as K-means. Local techniques aim to extract subsequences of the original time series as features. These subsequences are called shapelets because they correspond to shapes embedded within a larger time series that are useful for discriminating between univariate time series from different classes or as a similarity measure between time series for clustering.
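By way of illustration only, the global and local features described above can be sketched as follows. This is a minimal sketch that uses only the Python standard library. The function names, the absolute-difference cost used for Dynamic Time Warping, and the use of the minimum subsequence distance as the shapelet feature are assumptions made for this example and are not taken from any particular prior approach.

```python
import math
from statistics import mean, pvariance


def euclidean_distance(a, b):
    """Similarity-based global feature: Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Similarity-based global feature: textbook Dynamic Time Warping distance.

    Uses the absolute difference as the local cost (an assumption of this sketch)
    and runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def interval_features(series, window, step=1):
    """Interval-based global features: (mean, variance, min, max) per sliding window."""
    features = []
    for start in range(0, len(series) - window + 1, step):
        w = series[start:start + window]
        features.append((mean(w), pvariance(w), min(w), max(w)))
    return features


def shapelet_distance(series, shapelet):
    """Local feature: smallest Euclidean distance between the shapelet and any
    subsequence of the series that has the same length as the shapelet."""
    k = len(shapelet)
    if k > len(series):
        raise ValueError("shapelet must not be longer than the series")
    return min(
        euclidean_distance(series[i:i + k], shapelet)
        for i in range(len(series) - k + 1)
    )
```

In such a sketch, each returned value or tuple would form one entry of the feature vector supplied to a classifier or clustering algorithm.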
Univariate shapelets have also been used for early classification of time series. The main idea is to balance the discriminative power of a shapelet against the time at which it occurs in time series data collected in an online setting, with preference given to shapelets that occur early in the data.
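By way of illustration only, the trade-off between discriminative power and earliness can be sketched as follows. The function names, the distance threshold used to decide that a shapelet has occurred, and the linear weighting formula are all hypothetical and are introduced solely for this example; they do not reproduce the scoring used by any particular prior approach.

```python
import math


def earliest_match_end(series, shapelet, threshold):
    """Return the index of the last sample of the first subsequence whose
    Euclidean distance to the shapelet is at most `threshold`, or None if
    the shapelet never occurs. `threshold` is a hypothetical parameter."""
    k = len(shapelet)
    for end in range(k - 1, len(series)):
        window = series[end - k + 1:end + 1]
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(window, shapelet)))
        if dist <= threshold:
            return end
    return None


def earliness_weighted_score(discriminative_power, match_end, series_length):
    """Hypothetical score: scale the discriminative power by the fraction of
    the series that has not yet been observed when the shapelet is matched,
    so that earlier matches receive higher scores."""
    if match_end is None:
        return 0.0
    return discriminative_power * (1.0 - match_end / series_length)
```

Under this assumed weighting, two shapelets with equal discriminative power are ranked by how early they are first matched.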
With regard to event extraction and event pattern mining from time series, traditional approaches have relied on change point detection to define events and have then used standard frequent and sequential pattern mining algorithms to extract event patterns. Such approaches only work for instantaneous events that occur at a single time point. More recent work has extracted events that occur over an interval of time series data. These interval events are defined in terms of the time series values, e.g., a “high” (“low”) event occurs when values are above (below) a threshold, a linearly increasing event occurs when values rise linearly over the interval, and so forth.
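By way of illustration only, threshold-based interval events of the “high” and “low” kind can be sketched as follows. The function name, the use of two separate thresholds, and the representation of an event as a (label, start index, end index) tuple with an inclusive end index are assumptions made for this example.

```python
def extract_interval_events(values, high, low):
    """Return maximal runs of samples above `high` or below `low` as
    (label, start_index, end_index) tuples, with end_index inclusive."""
    events = []
    current = None  # label of the run currently being tracked, if any
    start = 0
    for i, v in enumerate(values):
        if v > high:
            label = "high"
        elif v < low:
            label = "low"
        else:
            label = None
        if label != current:
            # A run ends whenever the label changes; record it if it was an event.
            if current is not None:
                events.append((current, start, i - 1))
            current, start = label, i
    if current is not None:
        events.append((current, start, len(values) - 1))
    return events


events = extract_interval_events([5, 9, 9, 5, 1, 1, 1, 5, 9], high=8, low=2)
# events == [("high", 1, 2), ("low", 4, 6), ("high", 8, 8)]
```

Unlike a change point, which marks a single time point, each tuple in this sketch records both the start and the end of an event.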