Numerical modeling techniques have found practical application in business, permitting businesses to make decisions and take actions that can enhance revenue, market share, and other desirable objectives. Construction of detailed business models has been made more feasible by the availability of more detailed empirical observations from sources such as point of sale devices, which can capture detailed information not previously available. There exists a need, therefore, for new approaches to analyze this available information in order to make better informed decisions. Innovations as described in this application have been found to render such modeling efforts comparatively more effective than models constructed without benefit of these teachings, as has been reflected in the commercial success these innovative techniques have enjoyed in comparison with other less effective techniques against which they compete. The innovations described herein are being used to enable businesses to increase profits, increase sales volume, increase market share, improve risk profiles, manage strategic goals, and forecast more accurately quantities such as profit and origination volume.
Parameter estimation is conceptually simple. Data are observed empirically in the realm to be modeled for variates believed to be correlated. A form of function is postulated, the function being characterized by a set of parameters. Parameter estimation is a process for calculating, determining, or otherwise estimating those parameters characterizing that correlation function so as to minimize in some sense the difference between values predicted by the correlation function and values observed empirically.
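By way of illustration only, the process described above can be sketched for the simplest case: a straight-line function with two parameters, estimated by ordinary least squares. The function name fit_line and the price and sales figures are hypothetical and are not taken from this application.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope minimizing the squared error
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical observations: unit price and units sold.
prices = [1.0, 2.0, 3.0, 4.0]
sales = [9.0, 7.0, 5.0, 3.0]
intercept, slope = fit_line(prices, sales)
print(intercept, slope)
```

Here the parameters are chosen to minimize the sum of squared differences between predicted and observed values, which is one common sense in which the difference may be minimized.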
Although parameter estimation is conceptually simple, many practical factors make the process more complicated. Many theoretical solutions to the problem of parameter estimation are known for those circumstances where data are well-behaved in adhering to conventional parametric statistics. But in the real world, circumstances frequently encountered in empirically observed data limit the effectiveness of these theoretically sound processes. Practical factors complicating parameter estimation include: lack of information, co-linearity, heteroskedasticity, over-dispersion, bad data, and serial correlation.
One factor which may complicate parameter estimation in practice is a lack of useful information. Where, for example, the empirical data describes sales history, the price of an item may never have changed, leading to no information about the price elasticity. In standard regression analysis this would produce an indeterminate (singular) matrix. This is a consequence of the fact that this empirical data is a record of observations, not the result of experimental design. Circumstances in general over the period of time observed may never have presented variations for which in theory one would prefer to have empirical observations.
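As a minimal sketch of this difficulty, assuming a regression with an intercept and a single price term, the determinant of the normal-equations matrix can be computed directly. When the price never varies the determinant is zero and the regression has no unique solution. The helper normal_matrix_det and the price histories are hypothetical.

```python
def normal_matrix_det(xs):
    """Determinant of X^T X for a design matrix with columns [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # X^T X = [[n, sx], [sx, sxx]] for an intercept-plus-slope model.
    return n * sxx - sx * sx


constant_prices = [2.0, 2.0, 2.0, 2.0]  # price never changed
varied_prices = [2.0, 2.5, 3.0, 2.0]    # price changed at least once

det_constant = normal_matrix_det(constant_prices)
det_varied = normal_matrix_det(varied_prices)
print(det_constant, det_varied)
```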
Another factor which may complicate parameter estimation is co-linearity in factors. Factors here refers to those elements of the empirically observed data that are considered to be inputs in the correlation function. For example, it may be desired to model sales rate, volume, or quantity as a function of factors including item price and promotional efforts. In the empirically observed data, it may occur that the price changes at essentially the same time that the item is put on promotion. A practical problem in conventional parameter estimation is how to know how much of any change in sales is attributable to the price drop and how much is attributable to the promotional efforts. Such co-linearity leads to a problem in conventional analysis similar to that caused by the lack of information discussed above.
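One simple way to detect this situation, offered here only as an illustrative sketch, is to compute the correlation between the two factors. The weekly indicator series below are hypothetical and represent the limiting case in which every price cut coincides with a promotion.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly indicators: 1 when the event occurred in that week.
price_cut = [0, 0, 1, 1, 0, 1, 0, 0]
on_promotion = [0, 0, 1, 1, 0, 1, 0, 0]

r = pearson(price_cut, on_promotion)
print(r)
```

A correlation at or near 1 between two factors indicates that the data cannot separate their individual effects on sales.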
Another factor which may complicate parameter estimation is heteroskedasticity. The term heteroskedasticity denotes the effect that errors in observations may change with each measurement. For example, in sales data the error estimate associated with a promotional event may be much higher than the error estimate associated with a regular sales event. In general, the error estimates for observations at key points in the data set for which it is desired to construct an attribute model may not be the same as the error estimates for observations in other parts of the data set.
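The effect can be illustrated, under assumed figures, by comparing the variance of model residuals for regular weeks with that for promotional weeks. The residual values below are hypothetical and serve only to show the comparison.

```python
from statistics import pvariance

# Hypothetical residuals (observed minus predicted sales) by week type.
regular_residuals = [-1.0, 0.5, 1.0, -0.5]
promo_residuals = [-8.0, 6.0, 10.0, -8.0]

var_regular = pvariance(regular_residuals)
var_promo = pvariance(promo_residuals)

# A ratio far from 1 suggests the error variance is not constant.
variance_ratio = var_promo / var_regular
print(var_regular, var_promo, variance_ratio)
```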
Another factor which may complicate parameter estimation is over-dispersion in the distribution characterizing data empirically observed. It has been found in practice that many naturally occurring distributions do not follow simple and conventional known distributions such as, for example, a Gaussian or Poisson distribution. Instead, distributions observed in practice may be over-dispersed or may present wider tails than would be expected with simple distributions.
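For count data, one common diagnostic, shown here as a sketch rather than as part of the described innovations, is the variance-to-mean ratio. A Poisson distribution has a ratio of 1, so a ratio well above 1 suggests over-dispersion. The daily unit counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio; about 1 for Poisson-distributed counts."""
    return variance(counts) / mean(counts)


# Hypothetical daily unit sales: mostly small counts with occasional bursts.
daily_units = [0, 1, 0, 2, 15, 0, 1, 0, 3, 18]

index = dispersion_index(daily_units)
print(index)
```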
Another factor which may complicate parameter estimation is bad or incomplete data. For example, where empirically observed data models events such as product sales, a strong spike in sales can have a strong influence on the parameter estimates. Such a strong spike may occur in the empirically observed data at essentially the same time as some change in the attribute being modeled, but that similarity in time may be entirely coincidental. The spike in sales could instead have been caused by some other factor for which the model has no visibility. Such coincidental variation can clearly cause a problem in the model. In practice, one technique used to try to minimize the effects of such bad or incomplete data is to filter outliers before modeling. However, indiscriminate filtering of outliers can cause valuable information to be wrongly classified as outliers and discarded. Those outliers may correspond to points in the model in which one is most interested. There is a need, therefore, for a more intelligent and principled way to identify outliers less arbitrarily.
Another important factor which may complicate parameter estimation is serial correlation. Data empirically observed may be a function of time, reflecting trends, seasonality, and periodicity. In general, data observed may be in part a function of previously observed data. What happened yesterday may be a good indicator of what will happen today.
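Serial correlation can be measured, for example, by the lag-1 autocorrelation of a series, which compares each observation with the one immediately before it. The following sketch uses a hypothetical trending sales series. The function name is illustrative only.

```python
def lag1_autocorrelation(series):
    """Sample autocorrelation between each value and its predecessor."""
    n = len(series)
    m = sum(series) / n
    num = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, n))
    den = sum((x - m) ** 2 for x in series)
    return num / den


# Hypothetical daily sales with a steady upward trend.
trending_sales = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]

rho = lag1_autocorrelation(trending_sales)
print(rho)
```

A value well above zero indicates that successive observations are not independent, contrary to the assumption made by many conventional estimation procedures.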
Parameter estimation, while conceptually simple, thus becomes a very challenging problem due to these practical issues. While there is commercial software available to estimate model parameters, products now available in general do not sufficiently productize this process. Existing approaches using known, commercially available software typically require a sophisticated analysis from a highly qualified individual analyst (typically a person having a doctorate in a relevant field) to “build the model”. Building the model is a process where the analyst reviews and tweaks the model based on intuition and an understanding of potential parameter estimation problems. Building the model is not a scalable process for large enterprises that are required to manage millions of models. For example, a retailer may have hundreds of thousands of products that are sold in thousands of stores throughout the world, corresponding to hundreds of millions of individual product models which must be managed.
In many attribute models as currently practiced a single parameter is employed to model the equation. That parameter is usually postulated to adhere to some parametric distribution, such as a Gaussian, Poisson, or Student's t distribution. There exists a need therefore for a modeling framework for analyzing business problems and the like in which the parameters are not limited to one dimension. There further exists a need for a modeling framework for analyzing business problems and the like in which the model is not restricted to a linear form or the assumption of a particular parametric distribution.
Recently, there has been strong interest in “Bayesian shrinkage”. Bayesian shrinkage is a statistical technique that essentially shrinks or pulls a parameter value towards its average. Bayesian shrinkage helps to reduce the problems associated with lack of information and co-linearity when one has a good external estimate for a parameter. However, it is frequently the case that one does not have a good external estimate for a parameter. Furthermore, Bayesian shrinkage does not incorporate knowledge concerning (or otherwise take advantage of) relationships between parameters. Often, the analyst or other individual seeking to model empirically observed data may know or suspect some functional relationship between or among parameters. This expectation of a functional relationship may be based on factors such as intuition, experience, or knowledge of physical laws showing that there is some functional relationship between parameters. Current applications provide no way of supplying this information to the parameter estimation process. Also, Bayesian shrinkage does not help with bad data that tend to throw parameters to unrealistically extreme values. There exists a need, therefore, for an attribute modeler better able to overcome these problems that exist in known approaches to attribute modeling.
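A generic form of such shrinkage, given here as a sketch under the assumption of a Gaussian estimate and a Gaussian prior, is a precision-weighted average of the data-derived estimate and the external average. The function shrink and the elasticity figures are hypothetical and do not represent any particular commercial implementation.

```python
def shrink(estimate, estimate_var, prior_mean, prior_var):
    """Pull a noisy estimate toward a prior mean, weighting by precision."""
    # Weight on the data estimate: near 1 when the estimate is precise,
    # near 0 when the estimate is much noisier than the prior.
    w = prior_var / (prior_var + estimate_var)
    return w * estimate + (1.0 - w) * prior_mean


# Hypothetical price elasticity that is poorly determined by the data.
noisy = shrink(estimate=-6.0, estimate_var=4.0, prior_mean=-2.0, prior_var=1.0)

# The same estimate, well determined by the data, moves far less.
precise = shrink(estimate=-6.0, estimate_var=0.25, prior_mean=-2.0, prior_var=1.0)
print(noisy, precise)
```

The sketch treats each parameter in isolation and depends entirely on the supplied prior mean, which reflects the limitations noted above.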
The term outliers denotes unlikely events that can bias parameter estimates. Outliers are anomalies that need to be removed from the modeling data set. Typically in systems in use today outliers are flagged based on a Gaussian distribution. If a data point is more than N standard deviations from the mean, it is flagged as an outlier and eliminated from the modeling data set. Unfortunately, this simple process may eliminate much of the important data for certain applications, such as modeling sales data. Using this criterion of deviation from a Gaussian distribution, most promotional events would be classified as outliers and eliminated. There exists a need, therefore, for a better way to identify outliers in an attribute modeler.
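The conventional rule described above, and its shortcoming, can be sketched as follows. The weekly sales figures, the promotion calendar, and the choice of N = 2 are hypothetical and are chosen only to illustrate the behavior.

```python
from statistics import mean, pstdev


def flag_outliers(values, n_sigma):
    """Flag points more than n_sigma standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [abs(v - mu) > n_sigma * sigma for v in values]


# Hypothetical weekly unit sales; weeks 6 and 11 were promotional weeks.
weekly_units = [20, 22, 19, 21, 20, 60, 21, 18, 20, 19, 65, 21]
on_promotion = [False] * 5 + [True] + [False] * 4 + [True] + [False]

flags = flag_outliers(weekly_units, n_sigma=2.0)
print(flags)
```

In this example the only points flagged are the two promotional weeks, so filtering on this rule would discard exactly the observations that carry information about promotional response.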