For many kinds of business and scientific applications, the ability to generate accurate forecasts of future values of various measures (e.g., retail sales, or demands for various types of goods and products) based on previously collected data is a critical requirement. The previously collected data often consists of a sequence of observations, called a “time series” or a “time series data set,” obtained at respective points in time, with values of the same collection of one or more variables obtained for each point in time (such as the daily sales generated at an Internet-based retailer). Time series data sets are used in a variety of application domains, including, for example, weather forecasting, finance, econometrics, medicine, control engineering, astronomy, and the like.
The process of identifying a forecasting model for a time series often includes fitting certain structured time series models (or combinations of such models), e.g., autoregressive models, moving average models, periodic/seasonal models, or regression models. Often, a particular modeling/forecasting methodology is selected, and several different specific models that use the same methodology or have the same general structure (but differ from each other in one or more model parameters) are then generated. One is then faced with the problem of selecting a particular model of the family as the best or optimal model.
For time series models, it is common to compare the various models of the family using a metric such as the maximized likelihood, i.e., the value of the likelihood function evaluated at the fitted estimates of the parameters. The optimal model is then selected using the “best” value of the metric (where the definition of “best” may vary depending on the metric selected). However, this approach is not always reliable. If log likelihood is being used as the metric, for example, more complex models in the model family (where model complexity is measured in terms of the number of parameters in the model) will tend to have a higher log likelihood score. Such complex models, while appearing superior on the basis of the metric selected, may sometimes perform relatively poorly with respect to forecasting accuracy, e.g., due to over-fitting on the training data. A number of ad hoc approaches to the problem of model selection in the time series context have been devised (e.g., adjusting log likelihood values to penalize more complex models), typically without yielding a sound general solution.
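By way of illustration only, the following Python sketch shows the effect described above for one simple model family. It is a minimal example under stated assumptions and is not taken from any particular embodiment. The family is assumed to be autoregressive models of order p fitted by least squares with Gaussian noise, the data is synthetic, and the penalized score shown is the Akaike information criterion (AIC), chosen here only as one well-known example of adjusting log likelihood for model complexity. All function names (`simulate_ar2`, `lagged_design`, `fit_ar`, `compare_family`) are hypothetical. Every model is fitted to the same target observations, so the models are nested and the training log likelihood cannot decrease as p grows, whereas the held-out forecast error need not improve.

```python
import numpy as np


def simulate_ar2(n, seed=0):
    """Synthetic AR(2) series (an assumption made purely for illustration)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()
    return y


def lagged_design(y, p, max_p):
    """Design matrix [1, y[t-1], ..., y[t-p]] for the targets y[max_p:].

    Using max_p for every model keeps the target rows identical across the
    family, so the models are nested and their likelihoods are comparable.
    """
    n = len(y) - max_p
    cols = [np.ones(n)] + [y[max_p - k: len(y) - k] for k in range(1, p + 1)]
    return np.column_stack(cols), y[max_p:]


def fit_ar(y, p, max_p):
    """Least-squares AR(p) fit; returns coefficients, log likelihood and AIC."""
    X, target = lagged_design(y, p, max_p)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n = len(target)
    sigma2 = resid @ resid / n  # maximum likelihood estimate of noise variance
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    k = p + 2  # intercept, p lag coefficients, noise variance
    aic = 2 * k - 2 * loglik
    return coef, loglik, aic


def compare_family(y, max_p=8, n_train=150):
    """Fit AR(0)..AR(max_p) on the first n_train points; score on the rest."""
    results = []
    for p in range(max_p + 1):
        coef, loglik, aic = fit_ar(y[:n_train], p, max_p)
        X_all, target_all = lagged_design(y, p, max_p)
        start = n_train - max_p  # first row whose target is held out
        pred = X_all[start:] @ coef  # one-step-ahead forecasts
        mse = float(np.mean((target_all[start:] - pred) ** 2))
        results.append(
            {"p": p, "loglik": float(loglik), "aic": float(aic), "holdout_mse": mse}
        )
    return results


results = compare_family(simulate_ar2(300))
for r in results:
    print(r["p"], round(r["loglik"], 2), round(r["aic"], 2), round(r["holdout_mse"], 3))
```

In this sketch, ranking the family by training log likelihood alone would always favor the largest order considered, which is the unreliability noted above. The AIC and held-out error columns are printed only so that the rankings can be compared; neither is presented as a sound general solution.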
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to.