1. Field of the Invention
Embodiments of the present invention generally relate to data analysis and, more particularly, to a method and apparatus for normalizing and predicting time series data.
2. Description of the Related Art
Large sets of time series data often require analysis. Time series data is data that is recorded at regular intervals over time. For example, time series data may be the daily temperature changes of a city, or the daily changes in the stock price of a company. Data analysis can be used to predict future data values. However, the quality of the time series data upon which a prediction is based is crucial to the accuracy of that prediction.
Data may be contaminated if an error occurred while the data was being recorded, preventing the data from being recorded properly. It is also possible that the definition of the recorded variable itself changed. For example, in the case of stock prices, a stock split may have taken place, so that the data before the stock split is not on the same scale as the data after the stock split. Before any analysis is performed or any conclusions are drawn from the data, the data must first be corrected. Otherwise, one could easily infer or predict a sharply rising or falling trend where none actually exists, or one could infer changes in seasonality based on erroneous data.
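By way of illustration only, the stock split example can be sketched in Python. The function name `adjust_for_split`, the 2-for-1 split ratio, and the price values are hypothetical and chosen purely for the example; the sketch shows only how a known change of scale distorts a series, and is not the normalization method of the invention.

```python
def adjust_for_split(prices, split_index, ratio):
    """Rescale prices recorded before split_index onto the post-split scale.

    Hypothetical helper: it assumes the split position and ratio are known.
    """
    return [p / ratio if i < split_index else p for i, p in enumerate(prices)]


# Made-up daily closing prices; a 2-for-1 split takes effect at index 3.
raw = [100.0, 102.0, 104.0, 52.5, 53.0]
adjusted = adjust_for_split(raw, split_index=3, ratio=2.0)

# Day-over-day change across the split, before and after the correction.
raw_change = raw[3] - raw[2]
adjusted_change = adjusted[3] - adjusted[2]

print(adjusted)
print(raw_change, adjusted_change)
```

In the uncorrected series the change across the split appears as a drop of more than fifty units, which an analysis could mistake for a sharply falling trend. Once both segments are on the same scale, the same step is a small increase.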
Noise is data that carries no meaningful information. Noise removal is a technique for correcting a set of data that contains errors. Conventional noise removal techniques, such as median filtering or outlier rejection, fail when the variations in the time series data are very large and the outliers are as numerous as, or more numerous than, the inliers.
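The failure of median filtering can be illustrated with a minimal Python sketch. The function `median_filter`, the window size of three samples, and the sample values (a true level of 10 contaminated by readings of 100) are assumptions made for the example and do not come from any particular conventional implementation.

```python
from statistics import median


def median_filter(values, window):
    """Sliding median; the window is truncated at the ends of the series."""
    half = window // 2
    filtered = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        filtered.append(median(values[lo:hi]))
    return filtered


# Assumed true level is 10; readings of 100 are treated as outliers.
few_outliers = [10, 10, 100, 10, 10, 10, 10]          # 1 outlier, 6 inliers
many_outliers = [100, 100, 10, 100, 100, 10, 100, 100]  # 6 outliers, 2 inliers

cleaned_few = median_filter(few_outliers, window=3)
cleaned_many = median_filter(many_outliers, window=3)

print(cleaned_few)
print(cleaned_many)
```

When the outlier is isolated, every window is dominated by inliers and the filter recovers the level of 10. When outliers outnumber inliers, every window is dominated by outliers, so the filter discards the two correct readings and returns the contaminated level of 100 throughout.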
Therefore, there is a need in the art for a method and apparatus for normalizing and predicting time series data.