Modeling is commonly used to forecast or predict behavior or outcomes. The models may be generated through regression analysis or another method of analyzing historical data. For example, companies use historical sales data to generate models that predict how sales will be affected in the future, and these companies may then make adjustments to improve sales or to control product inventory.
There are many conventional techniques for evaluating the accuracy of the output of these models, e.g., sales predictions. However, once a model is determined to be inaccurate, it is very difficult to improve its accuracy if the problem lies in the input data used to generate the model. Poor model performance may result from insufficient data for certain model input parameters from certain data collection sources, or from inconsistent calculations performed by different sources when determining the parameters. It may take many person-hours to analyze each input parameter in order to identify which ones are causing the inaccurate model predictions. The analysis is further complicated by the fact that there is no objective measure for evaluating the quality of the input parameters or for estimating the impact of different data quality aspects on the quality of the final model. In addition, it is costly for companies to collect the historical data and to build the models. Often, the collected data is not analyzed beforehand to determine whether it can be used to build accurate models. As a result, time and money are wasted building inaccurate models.