Every day, companies aggregate more and more data in the hope that useful information can be extracted from it. The data may be collected directly by the companies themselves or by third-party vendors who sell the data to the companies. For example, a company may be interested in evaluating the sales performance of a particular brand relative to its competitors in various market regions. To start, the company may collect data on its own sales of the particular brand in the various regions and buy corresponding data on its competitors' sales from a third party. Having gathered the data, the company may enter the data into one or more models to try to extract information from the data. The details of the process, however, are usually not straightforward, and companies usually face a number of problems.
A first problem that companies usually face is having to manage a significant number of different models and their revisions over time. For example, a company generally has numerous spreadsheet analysis models that enable it to evaluate different aspects of its business using mathematical functions and other capabilities supported by the spreadsheet. These models, however, are hard to institutionalize because spreadsheets lack version control. Also, integrating the spreadsheet models into software programs already in use by the company would require extensive investments of both time and money. As a result, companies are generally unable to integrate decision-support models into existing software without spending thousands, if not millions, of dollars on developing cumbersome programs.
A second problem that companies usually face is working with many different data sources and/or data formats. When incoming data are not standardized to a single uniform structure, companies are often unable to load and integrate data from disparate sources. As a result, companies may spend thousands of dollars attempting to modify their underlying data model, change the SQL/ETL and other tools used for data processing, and troubleshoot data quality issues. In some cases, a company may try to modify and standardize the structure of the incoming data sources. However, such an endeavor is hard to achieve because the data providers are often third parties who do not have the same incentives as the company.
In view of the foregoing, there exists a need for a system and method for managing data and data models that overcome these problems.