Forecasting Time Series
A time series is a sequence of observations ordered in time (e.g., observations made at evenly spaced time intervals). Examples of time series data include end-of-month stock prices for General Electric, hits per day at a web site, the volume of usage of a communications network, weekly sales of Windows XP, electrical demand per hour in Seattle, and daily high tide readings in San Francisco Bay.
Various forecasting methods attempt to predict future values of a time series from its past values. Some are as simple as extending the trend curve smoothly with a straight line. Others are more sophisticated. The best-known and most widely used method is the ARIMA (autoregressive integrated moving average) procedure due to Box and Jenkins (George E. P. Box, et al., “Time Series Analysis: Forecasting And Control,” 3rd Edition, Prentice Hall, Feb. 9, 1994), a procedure which assumes that each measurement in a time series is generated by a linear combination of past measurements plus noise.
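The idea that each measurement is a linear combination of past measurements plus noise can be illustrated with the simplest autoregressive case. The following is a minimal sketch only, not the full Box and Jenkins procedure: it assumes a first-order model x[t] = c + phi * x[t-1] + noise, fitted by ordinary least squares. The function names fit_ar1 and forecast_ar1 are hypothetical and are introduced here purely for illustration.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1] + noise.

    Assumption for illustration: a first-order model with an intercept.
    Returns the pair (c, phi).
    """
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    # Slope of the regression of x[t] on x[t-1].
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_ar1(series: List[float], steps: int) -> List[float]:
    """Iterate the fitted recursion forward, with future noise set to zero."""
    c, phi = fit_ar1(series)
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Noise-free toy data generated by x[t] = 2 + 0.5 * x[t-1].
history = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(history)
next_values = forecast_ar1(history, 2)
```

On this noise-free toy data the fit recovers the generating coefficients. Real data would call for higher model orders, differencing and moving-average terms, which this sketch omits.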
Statistical methods (Gilchrist W., Statistical Forecasting, John Wiley & Sons; December 1976) related to solving forecasting problems include Taylor Series Exponential Smoothing, Decision Trees, Neural Networks and Heuristic Networks.
A number of publications describe time series forecasting as applied to specific problems. For example, time series forecasting has proven useful for predicting usage of network resources (see U.S. Pat. No. 5,884,037, U.S. Pat. No. 6,125,105, and US 2001/0013008, the disclosures of which are incorporated herein by reference). Another disclosure providing potentially relevant background information is Wolski R., Dynamically forecasting network performance using the Network Weather Service, (Cluster Computing, vol. 1, no. 1, pp. 119-132, 1998).
Other applications of time series forecasting include the forecasting of glucose concentration (see U.S. Pat. No. 6,272,364 and U.S. Pat. No. 6,546,269, the disclosures of which are incorporated herein by reference) and the forecasting of macroeconomic data (see Clements M., and Hendry D., “Forecasting Economic Time Series”, Cambridge University Press, 1998).
One known technique for improving forecast quality is to use a multiple forecasting model, which combines forecasts obtained from a plurality of different forecasting models. For example, U.S. Pat. No. 6,535,817, the disclosure of which is incorporated herein by reference, discloses methods, systems and computer program products for generating weather forecasts from a multi-model superensemble. In particular, U.S. Pat. No. 6,535,817 and T. N. Krishnamurti et al., “Improved Weather and Seasonal Climate Forecasts from Multimodel Superensemble”, Science, vol. 285, No. 5433, pp. 1548-1550, Sep. 3, 1999, disclose the generation of a model that combines the historical performance of forecasting data from multiple weather forecasting models over a large number of geographic areas or regions to produce a unified forecast.
U.S. Pat. No. 6,032,125 discloses the use of a plurality of neural networks to forecast the sales of products.
Other disclosures providing potentially relevant background material and related to combining forecasts include:
Cesa-Bianchi N., et al., “How to use expert advice,” Journal of the Association for Computing Machinery, Vol. 44, No. 3, pp. 427-485, May 1997;
Clemen R. T., “Combining Forecasts”, International Journal of Forecasting, No. 5, pp. 559-583, 1989;
Clements M., and Hendry D., Forecasting Economic Time Series, Cambridge University Press, 1998;
Herbster M., and Warmuth M., “Tracking the best expert,” In Proceedings of the Twelfth International Conference on Machine Learning, pp. 286-294, 1995;
Opitz D. W., and Shavlik J. W., “Generating Accurate and Diverse Members of a Neural-Network Ensemble,” Advances in Neural Information Processing Systems, vol. 8, The MIT Press, pp. 535-541, 1996;
Krogh A., and Vedelsby J., “Neural Network Ensembles, Cross Validation, and Active Learning,” Advances in Neural Information Processing Systems, vol. 7, The MIT Press, pp. 231-238, 1995;
Thompson, P. D., “How to improve accuracy by combining independent forecasts,” Mon. Wea. Rev., 105, 228-229, 1977.
In general, a number of difficulties need to be overcome when combining forecasting models. For applications where limited computational resources are a factor, care must be taken to avoid or minimize redundancy between forecasting models. Furthermore, it is not always clear a priori how much weight to assign to each of the constituent forecast models. In situations where there are numerous models to be combined and numerous time series to be forecast, the computational costs associated with building an appropriate forecasting model for each time series can be prohibitive. Finally, it is noted that issues associated with model overfitting can be difficult to treat appropriately in a combined model.
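As a rough illustration of the weighting question, one common heuristic is to weight each constituent model by the inverse of its historical mean squared error, so that models with a better track record contribute more. This is a simple stand-in chosen for illustration and is not the regression-based superensemble of the references cited above. The function names inverse_mse_weights and combine, and the toy data, are hypothetical.

```python
from typing import Dict, List


def inverse_mse_weights(
    history: Dict[str, List[float]], actual: List[float]
) -> Dict[str, float]:
    """Weight each model by 1 / (historical mean squared error), normalized to sum to 1.

    Assumption for illustration: every model has a past forecast for each
    value in `actual`.
    """
    eps = 1e-12  # guards against division by zero for a perfect model
    inverse_errors = {}
    for name, predictions in history.items():
        mse = sum((p - a) ** 2 for p, a in zip(predictions, actual)) / len(actual)
        inverse_errors[name] = 1.0 / (mse + eps)
    total = sum(inverse_errors.values())
    return {name: value / total for name, value in inverse_errors.items()}


def combine(weights: Dict[str, float], new_forecasts: Dict[str, float]) -> float:
    """Weighted average of the constituent models' new forecasts."""
    return sum(weights[name] * new_forecasts[name] for name in weights)


# Toy history: model_a was off by 1 each time, model_b by 2 each time.
actual = [10.0, 12.0, 11.0]
history = {
    "model_a": [11.0, 13.0, 12.0],
    "model_b": [12.0, 14.0, 13.0],
}
weights = inverse_mse_weights(history, actual)
combined = combine(weights, {"model_a": 10.0, "model_b": 20.0})
```

Weights fitted on the same history used to judge the models can overfit, which is one of the difficulties noted above; a held-out evaluation period would be a natural refinement.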
An important application of time series forecasting methods is the prediction of the future consumption of a commodity from historical consumption data. Exemplary commodities include electricity, natural resources, network bandwidth, and money spent in a retail store.
Forecasting Commodity Consumption by Individuals
Although techniques for forecasting consumption of commodities by an entire population have been disclosed, the more difficult task of predicting future consumption of commodities by individual consumers within a population of consumers remains an open problem. While the former problem addresses the question “how much of the commodity in total will be consumed at a certain time,” the latter problem attempts to forecast “how much will each specific consumer within the large population consume at a certain time.”
The differences between these two problems are substantial. In many applications the number of commodities to be forecast is small, while the number of consumers is large. Because certain consumers exhibit irregular consumption habits, forecasts of commodity consumption by individual consumers are prone either to overfitting or to large errors. Furthermore, in the interest of building an accurate model, it is often desirable to combine forecasting models. Where forecasts are needed for a large number of consumers, combining specific forecasting models for each individual consumer can be computationally expensive.
There is an ongoing need for models, systems and computer readable code for forecasting commodity consumption of individuals within large, non-homogeneous populations. One exemplary commercial application relates to comparing predicted future commodity consumption values with actual commodity consumption values. In one specific example, an individual customer consumes wireless telephone services from several providers. Should the individual under-consume the telephone services as compared to the forecast consumption, this could indicate that the consumer has switched to another provider. According to this example, it could be advantageous to offer a discount in order to stimulate consumption of the wireless service by the individual customer.
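The comparison of forecast and actual consumption in the wireless example can be sketched as follows. This is a minimal illustration under stated assumptions: the function name flag_under_consumers, the customer identifiers, and the 0.7 threshold ratio are all hypothetical, and a practical system would need to choose the threshold from data.

```python
from typing import Dict, List


def flag_under_consumers(
    forecast: Dict[str, float],
    actual: Dict[str, float],
    min_ratio: float = 0.7,  # hypothetical threshold, chosen for illustration
) -> List[str]:
    """Return customers whose actual consumption fell below min_ratio * forecast.

    Customers with a non-positive forecast are skipped, since the ratio is
    undefined for them. Missing actual values are treated as zero consumption.
    """
    flagged = []
    for customer_id, predicted in forecast.items():
        if predicted <= 0:
            continue
        if actual.get(customer_id, 0.0) / predicted < min_ratio:
            flagged.append(customer_id)
    return sorted(flagged)


# Toy data: forecast and actual minutes of wireless usage per customer.
forecast_minutes = {"alice": 100.0, "bob": 200.0, "carol": 0.0}
actual_minutes = {"alice": 95.0, "bob": 90.0}
candidates = flag_under_consumers(forecast_minutes, actual_minutes)
```

Customers returned by such a check would be candidates for the discount offer described above.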