A predictive model (also known as a forecaster, a forecasting model, or an autoregressive model) is a software-implemented model of a system, process, or phenomenon, usable to forecast a value, output, or outcome expected from the system, process, or phenomenon. The system, process, or phenomenon that is modeled is collectively and interchangeably referred to hereinafter as a “process” unless specifically distinguished where used.
A simulation is a method of computationally looking ahead into the future of the execution of the process to predict one or more events that can be expected to occur in the process at that future time. A predicted event is a value, output, or outcome of the process at the end of a look-ahead period configured in the simulation.
A variable that affects an outcome of a process is called a factor or a feature. A predicted event or an outcome of a process is dependent upon, affected by, or otherwise influenced by a set of one or more factors. A factor can be independent, to wit, independent of and not affected by other factors participating in a given model. A factor can be dependent upon a combination of one or more other independent or dependent factors.
A predictive model has to be trained before the model can reliably predict an event in the future of the process with a specified degree of probability or confidence. Usually, but not necessarily, the training data includes past or historical outcomes of the process. The training process adjusts a set of one or more parameters of the model.
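As a non-limiting illustration of how training adjusts model parameters, the following minimal sketch fits a one-factor linear model to historical outcomes by ordinary least squares. The linear form, the function names `train_linear_model` and `predict`, and the sample values are assumptions made for illustration only; they are not drawn from any particular embodiment.

```python
def train_linear_model(factor_values, outcomes):
    """Fit outcome = slope * factor + intercept by ordinary least squares.

    The two returned numbers are the model parameters that training adjusts.
    """
    n = len(factor_values)
    if n < 2 or n != len(outcomes):
        raise ValueError("need at least two paired observations")
    mean_x = sum(factor_values) / n
    mean_y = sum(outcomes) / n
    sxx = sum((x - mean_x) ** 2 for x in factor_values)
    if sxx == 0:
        raise ValueError("factor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(factor_values, outcomes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(params, factor_value):
    """Apply the trained parameters to a new factor value."""
    slope, intercept = params
    return slope * factor_value + intercept


# Hypothetical historical data: factor values and the outcomes observed for them.
params = train_linear_model([1, 2, 3, 4], [3, 5, 7, 9])
next_outcome = predict(params, 5)
```

In this sketch the historical outcomes serve as the training data, and the slope and intercept are the parameters that the training process adjusts.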
A predictive model can also self-train using a machine learning process. The predictive model selects some of its own prior outputs depending upon some combination of the validity, accuracy, repeatability, and reliability of those prior outputs. The predictive model then consumes the selected prior outputs as training inputs, to improve some combination of the validity, accuracy, repeatability, and reliability of future outputs.
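The selection step of such self-training can be sketched as follows. The sketch assumes, purely for illustration, that each prior output carries a single numeric quality score standing in for some combination of validity, accuracy, repeatability, and reliability. The function names and the 0.9 threshold are hypothetical.

```python
def select_prior_outputs(prior_outputs, min_score):
    """Keep the (input, predicted_output) pairs whose score meets the threshold.

    Each element of prior_outputs is a tuple (input, predicted_output, score).
    """
    return [(x, y) for x, y, score in prior_outputs if score >= min_score]


def extend_training_set(training_set, prior_outputs, min_score=0.9):
    """Return a new training set that also contains the selected prior outputs."""
    return list(training_set) + select_prior_outputs(prior_outputs, min_score)


# Hypothetical data: one historical observation and two prior model outputs.
historical = [(1, 3)]
prior = [(2, 5, 0.95), (3, 9, 0.40)]
augmented = extend_training_set(historical, prior)
```

The augmented training set would then be supplied to whatever training process the model uses, so that only sufficiently trusted prior outputs influence future outputs.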
Data emitted over a period by a data source is called a time-series. In statistics, signal processing, and many other fields, a time-series is a sequence of data points, measured typically at successive times, spaced according to uniform time intervals, other periodicity, or other triggers.
Time-series analysis is a method of analyzing a time-series, for example to understand the underlying context of the data points, such as where they came from or what generated them. As another example, time-series analysis may analyze a time-series to make forecasts or predictions. Time-series forecasting is the use of a forecasting model to forecast future events based on known past events, to wit, to forecast future data points before they are measured. An example in econometrics is forecasting the opening price of a share of stock based on the stock's past performance.
Time-series forecasting uses one or more forecasting models to regress on independent factors to produce a dependent factor. For example, if Tiger Woods has been playing golf very quickly, the observed speed of play is an example of an independent factor. A forecasting model regresses on this historical data to predict future play rates. The future play rate is a dependent factor.
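As a non-limiting illustration, the following minimal sketch regresses each observation of a time-series on the observation before it and then iterates the fitted relation over a configured look-ahead period. The first-order autoregressive form, the function names `fit_ar1` and `forecast`, and the play-rate values are assumptions made for illustration only.

```python
def fit_ar1(series):
    """Fit x[t] = a * x[t-1] + b by ordinary least squares."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_p = sum(prev) / n
    mean_c = sum(curr) / n
    spp = sum((p - mean_p) ** 2 for p in prev)
    if spp == 0:
        raise ValueError("lagged observations must not all be identical")
    spc = sum((p - mean_p) * (c - mean_c) for p, c in zip(prev, curr))
    a = spc / spp
    b = mean_c - a * mean_p
    return a, b


def forecast(series, horizon):
    """Predict the next `horizon` values by iterating the fitted relation."""
    a, b = fit_ar1(series)
    value = series[-1]
    predictions = []
    for _ in range(horizon):
        value = a * value + b  # each prediction feeds the next step
        predictions.append(value)
    return predictions


# Hypothetical historical play rates (holes completed per hour).
play_rates = [3.0, 3.2, 3.1, 3.3, 3.2]
future_play_rates = forecast(play_rates, horizon=3)
```

Here the past play rates act as the independent factor, the predicted play rates are the dependent factor, and `horizon` corresponds to the look-ahead period.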
Time-series data is not always uniformly distributed and often includes anomalies. For example, if the data pertains to a golfing tournament, the events that occur in the tournament are reflected in the data. The type, spacing, peaking, repetition rate, intensity, duration, and other characteristics of the events are dependent on a variety of factors, and are therefore non-uniformly distributed in the data. A state-based forecasting model accounts for the anomalies in an input time-series when producing a forecast.
The non-uniformity of the distribution of an event in time-series data is referred to herein as an anomaly. For example, whether an event in the example golfing data has a certain value depends upon the time of day when the event occurs, the slope of the course, the weather conditions at the time, the skill level of the player, and many other factors that introduce anomalies in the event's data. To illustrate, the event may occur more regularly during midday than in the evening, or more predictably when a skilled player is playing than when a novice is playing.
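One simple way to observe such non-uniformity is to compare how often an event occurs per hour in different periods of the day. The following minimal sketch does so; the period boundaries, the function name `events_per_hour`, and the sample event hours are hypothetical and serve only as an illustration.

```python
def events_per_hour(event_hours, periods):
    """Return the average number of events per hour within each named period.

    event_hours: hour of day (0 to 23) at which each event occurred.
    periods: mapping of period name to a half-open (start_hour, end_hour) range.
    """
    rates = {}
    for name, (start, end) in periods.items():
        count = sum(1 for hour in event_hours if start <= hour < end)
        rates[name] = count / (end - start)
    return rates


# Hypothetical periods and event times for a single tournament day.
periods = {"midday": (11, 15), "evening": (17, 21)}
event_hours = [11, 12, 12, 13, 14, 14, 18, 20]
rates = events_per_hour(event_hours, periods)
```

Markedly different rates across periods would indicate that the event is non-uniformly distributed over the day, which a forecasting model may need to account for.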