An analytic model includes an analytic component (with associated metadata such as a description of the analytic technique used, the assumptions required for the technique to be valid, constraints, and sensitivities), a definition of the type of data on which the model operates, and a definition of the output the model produces. The analytic component can be built using a variety of techniques, including, but not limited to, mathematical modeling, statistical modeling, and data mining. Mathematical modeling can be used when the problem is sufficiently well understood (being expressed, for example, as a solution to the equations that describe changes in a system) and often requires only a modest amount of input data. Statistical modeling is used when the general mathematical structure can be hypothesized from domain knowledge and analysis of moderate amounts of data. Data mining is used when the mathematical structure is completely unknown, so that a large amount of data is required to infer both the structure and the parameters of the model.
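The parts of an analytic model described above can be pictured as a small record type. The following is only an illustrative sketch: the names `Technique`, `ComponentMetadata`, `AnalyticModel`, and `exponential_decay`, as well as the field layout and the decay example, are assumptions made for illustration and are not taken from the text.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Mapping, Tuple


class Technique(Enum):
    """The three example techniques named in the text (hypothetical enum)."""
    MATHEMATICAL = "mathematical modeling"
    STATISTICAL = "statistical modeling"
    DATA_MINING = "data mining"


@dataclass(frozen=True)
class ComponentMetadata:
    """Metadata associated with the analytic component."""
    description: str
    assumptions: Tuple[str, ...] = ()
    constraints: Tuple[str, ...] = ()
    sensitivities: Tuple[str, ...] = ()


@dataclass(frozen=True)
class AnalyticModel:
    """Analytic component plus definitions of its input data and output."""
    name: str
    technique: Technique
    component: Callable[[Mapping[str, float]], Mapping[str, float]]
    metadata: ComponentMetadata
    input_definition: Mapping[str, type]
    output_definition: Mapping[str, type]


def exponential_decay(inputs: Mapping[str, float]) -> Mapping[str, float]:
    # Hypothetical mathematical model: closed-form solution of dN/dt = -k * N.
    return {"n_t": inputs["n0"] * math.exp(-inputs["k"] * inputs["t"])}


decay_model = AnalyticModel(
    name="decay",
    technique=Technique.MATHEMATICAL,
    component=exponential_decay,
    metadata=ComponentMetadata(
        description="Closed-form exponential decay",
        assumptions=("decay rate k is constant over time",),
        sensitivities=("output is sensitive to k for large t",),
    ),
    input_definition={"n0": float, "k": float, "t": float},
    output_definition={"n_t": float},
)
```

Keeping the assumptions, constraints, and sensitivities in the same record as the component is one possible design; it means the metadata travels with the model wherever the model is used.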
A model instance is the execution of a model on a particular input set, producing an output based on those inputs. A model may have hundreds of model instances, depending on the frequency with which the model is executed. How long the output of a model instance is considered valid depends on a number of factors, including, but not limited to, the frequency with which the input data changes and the amount of quantitative change in the input data. If the analytic component of a model is revised, a new version of the model is said to be created. Model instances for the new version are generated when that version is executed.
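The relationship between a model, its versions, and its instances can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Model`, `ModelInstance`, `revise`, `execute`, and `is_still_valid` are hypothetical, and the validity rule (a maximum age plus a maximum relative change in any input) is just one simple way to encode the two factors mentioned above, not a rule given in the text.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ModelInstance:
    """One execution of a specific model version on a specific input set."""
    model_name: str
    model_version: int
    inputs: Dict[str, float]
    output: float
    executed_at: datetime


@dataclass(frozen=True)
class Model:
    name: str
    version: int
    component: Callable[[Dict[str, float]], float]

    def revise(self, new_component: Callable[[Dict[str, float]], float]) -> "Model":
        # Revising the analytic component creates a new version of the model.
        return replace(self, version=self.version + 1, component=new_component)

    def execute(self, inputs: Dict[str, float],
                now: Optional[datetime] = None) -> ModelInstance:
        # Each execution yields a new instance that keeps a copy of its inputs.
        return ModelInstance(
            model_name=self.name,
            model_version=self.version,
            inputs=dict(inputs),
            output=self.component(inputs),
            executed_at=now or datetime.now(timezone.utc),
        )


def is_still_valid(instance: ModelInstance, current_inputs: Dict[str, float],
                   now: datetime, max_age: timedelta,
                   max_relative_change: float) -> bool:
    # Assumed rule: the output is stale once it is too old or once any
    # input has drifted by more than the allowed relative amount.
    if now - instance.executed_at > max_age:
        return False
    for key, old in instance.inputs.items():
        scale = abs(old) if old != 0 else 1.0
        if abs(current_inputs[key] - old) / scale > max_relative_change:
            return False
    return True


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
v1 = Model("demand", 1, lambda x: 2 * x["price"])
inst = v1.execute({"price": 10.0}, now=t0)
v2 = v1.revise(lambda x: 3 * x["price"])
```

In this sketch, `inst` remains tied to version 1 even after `v2` is created, so an output can always be traced back to the version of the analytic component that produced it.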
In current practice, the inputs and assumptions upon which a model is defined are generally not recorded and maintained in a coherent fashion, either within an enterprise or across enterprises. Often, who is permitted to make changes to an analytic component, and when such changes are permitted, is not strictly enforced or tracked. Moreover, the data and metadata used in the execution of a model that creates a particular model instance are not recorded and maintained with the instance. This lack of provenance tracking can lead to incorrect decision making or, as time passes, to good decisions turning into incorrect ones, because the analysis is not updated when the baseline assumptions or input data that drove the original results of a particular model or model instance are no longer valid. The lack of provenance tracking can also result in an incorrect assessment of risk when a set of successive analytic models is employed, each based on the results of an instance of a previous model whose input data and assumptions are unknown or not fully understood.