With advances in contemporary business information systems, all levels of an organization can now enjoy access to repositories of business data known as data warehouses. Data warehousing techniques enable businesses to eliminate much of the unnecessary workload generated by multiple redundant reporting tasks, and can further facilitate the standardization of data throughout an organization. Business planning applications such as budgeting and forecasting systems are increasingly being integrated into advanced data warehousing solutions in order to maximize returns on what have often been considerable investments in both computing facilities and the data they contain.
A data warehouse contains collections of related data known as datasets. When these datasets are relatively small, such as when a data warehouse has been recently implemented, users can easily access and work with complete datasets directly on their personal computer systems. However, difficulties arise when datasets get larger. Datasets can eventually grow within a data warehouse facility to contain billions upon billions of individual data values, many times larger than can be handled by the computational capacity of any single user's computer system.
In order to provide a workable solution for handling these very large datasets, prior art methods have been employed to extract and deliver subsets of these larger datasets to designated users. This has required close management of the size of each data subset to ensure that users receiving these data subsets can consistently access them given the computational limitations of their individual computer systems, limitations such as calculation size limits, fixed memory limitations, and other hard limits. Upon completion of user interaction in these prior art methods, all data subsets must be returned to their “superior” datasets within the data warehouse through a process known as consolidation.
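By way of illustration only, the extraction and consolidation cycle described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reproduction of any particular prior art system: the record layout `Row`, the functions `extract_subsets` and `consolidate`, and the `max_rows` limit are hypothetical names introduced here, and a simple row count stands in for the calculation and memory limits of a user's computer system.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (key, value). Not taken from the source text.
Row = Tuple[str, float]


def extract_subsets(dataset: List[Row], max_rows: int) -> List[List[Row]]:
    """Split a dataset into consecutive subsets of at most max_rows rows.

    max_rows is an assumed stand-in for a user's hard computational limit.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return [dataset[i:i + max_rows] for i in range(0, len(dataset), max_rows)]


def consolidate(superior: Dict[str, float],
                subsets: List[List[Row]]) -> Dict[str, float]:
    """Write each returned subset back into a copy of the superior dataset."""
    merged = dict(superior)  # leave the original superior dataset untouched
    for subset in subsets:
        for key, value in subset:
            merged[key] = value
    return merged


# A small superior dataset of ten values, split for users limited to four rows.
superior = {f"cell{i}": float(i) for i in range(10)}
subsets = extract_subsets(list(superior.items()), max_rows=4)

# A user changes one value in the first subset before returning it.
subsets[0][0] = ("cell0", 99.0)

# Consolidation returns every subset to the superior dataset.
result = consolidate(superior, subsets)
```

In this sketch each subset is bounded only by row count; a real extraction would also have to account for whatever other hard limits apply to the receiving system.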
The problem with these prior art methods has been that they rely on manual techniques, or on scripts that must be run and maintained by hand, in order to extract the data subsets. The consolidation process has also been a mostly manual process of running database-specific scripts. In addition, the administrator responsible for creating and executing the extraction scripts must also keep track of which data has been delivered to which user.
The result has been that prior art data warehouse extraction and consolidation methods are highly time-consuming to define, execute and maintain for very large datasets. Furthermore, the delivery of data subsets to designated users lacks integrated tracking, and is often independent of, and therefore outside the control of, the organizational security structure employed by the querying application. Therefore, what is needed is a more manageable data model for supporting very large datasets.
For the foregoing reasons, there is a need for an improved modelling system and method for handling data queries that generate very large datasets.