Time series databases, which contain data captured over time, are commonly used in areas such as finance, meteorology, telecommunications, and manufacturing to track data valuable to each area. For example, financial databases may track stock prices over time. Scientific databases store meteorological parameters, such as temperature, over time. Telecommunications and network databases include data derived from the usage of various networking resources over time, such as the total number and duration of calls, the number of bytes or e-mail messages sent from one ISP to another, and the amount of web traffic at a site. Manufacturing databases include time series data such as the sales of a specific commodity over time.
Time series data depict trends in the captured data, which users may wish to analyze and understand. For a given time window, users may wish to know a trend of “typical” values or an “outlier” trend. Conversely, users may wish to find the time window in which most trends are as similar as possible, that is, clustered. These similar trends are called “representative trends.” Representative trends may be used in lieu of the entire database for quick approximate reasoning. In addition, they can be used for prediction and for detecting anomalous behavior or intrusions.
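As a rough illustration only, the notions of a “typical” trend and an “outlier” trend can be sketched with a brute-force comparison of windows. The sketch below is not the method of this disclosure. It assumes consecutive, non-overlapping windows of fixed width and Euclidean distance as the similarity measure, and the helper names (`split_windows`, `distance`, `typical_and_outlier`) and the sample data are hypothetical.

```python
import math

def split_windows(series, width):
    """Cut a series into consecutive, non-overlapping windows of `width` points.
    Trailing points that do not fill a whole window are dropped."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, width)]

def distance(a, b):
    """Euclidean distance between two equal-length windows (an assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def typical_and_outlier(windows):
    """Return the indices of the window closest to all others ("typical")
    and of the window farthest from all others ("outlier")."""
    totals = [sum(distance(w, other) for other in windows) for w in windows]
    indices = range(len(windows))
    typical = min(indices, key=totals.__getitem__)
    outlier = max(indices, key=totals.__getitem__)
    return typical, outlier

# Illustrative data: three similar windows followed by one that deviates.
series = [1, 2, 3, 1, 2, 3, 1, 2, 4, 9, 0, 9]
windows = split_windows(series, 3)
typical, outlier = typical_and_outlier(windows)
print(windows[typical], windows[outlier])
```

Because every window is compared with every other window, the work in this sketch grows quadratically with the number of windows, which becomes costly on large databases.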
By their very nature, time series databases tend to contain large amounts of data. Using representative trends in place of the full data therefore reduces the amount of data to be analyzed. However, the large amounts of data must first be processed in order to identify the representative trends.
There is a need in the art to identify representative trends efficiently and quickly in large amounts of data.