Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
A central data warehouse (CDW) generally serves as a central repository for informational data about a business enterprise. Typically, the source data for the central data warehouse comes from various operational applications executing in the business enterprise, such as enterprise resource planning (ERP) systems, customer relationship management (CRM) systems, human resource (HR) systems, and so on. The central data warehouse is sometimes referred to by other terms such as “data warehouse”, “centralized data warehouse”, “enterprise data warehouse”, and so on.
An enterprise typically builds a central data warehouse to enable a consolidated view of the relevant key performance indicators (KPIs) such as sales volume, margin, profit, etc. Setting up such a central data warehouse requires company-wide effort and represents a significant investment for the company. Despite the complexity and cost, a central data warehouse nonetheless provides valuable information to the business enterprise by providing a total view of the company's performance and financial status using data collected from various sources within the enterprise. A central data warehouse is typically maintained and controlled by an information technology (IT) department, which relies on clearly formulated requirements from the business.
The central data warehouse is suitable from the point of view of the enterprise as a whole. However, groups within the enterprise require flexibility in terms of being able to view the data in their own way, develop new data models, and conduct analyses in ways that are specific to their needs. The central data warehouse architecture is generally not so dynamic. Because of the centralized nature of the data, access to the central data warehouse is typically strictly controlled and limited. Accordingly, the response time to new or changing requirements from individual users or business departments is likely to be long. In addition, the sheer volume of data that may have to be processed can add to the delay. Also, an increasing number of legal constraints, such as auditing rules, data protection rules, world-wide financial regulations requiring centralized governance of the data, and so on, gives rise to procedural delays (“red tape”) that can further increase the response times.
A conventional solution is the use of local data marts. The term “data mart” is generally understood as comprising a partition of the total enterprise data that is stored and maintained in the central data warehouse. The data mart typically is created for a specific use by a group of users in the enterprise. For example, a sales group may only be interested in regional sales figures for its own planning purposes, and would not be interested in data relating to manufacturing. Accordingly, a data mart of regional sales figures may be instantiated for the sales group. Conventionally, the sales group might download a copy of just the regional sales data from the central data warehouse to create a local instance of the regional sales data in its data mart. The group then uses this data to build a smaller solution that it alone controls and maintains.
Having a local copy of the data in its own local data mart gives a business department within the enterprise the freedom to fulfill its requirements in the manner that it wants, without the constraints imposed by the central data warehouse. This conventional approach, however, has several drawbacks:
    Redundancy: It creates additional data redundancies in the organization as the data is replicated to the local instances.
    Security: The person who downloaded the data may be permitted to see more or different data than other users of the local solution. The access restrictions of the central data warehouse have to be re-implemented on the local instance or, as is often the case, are simply neglected.
    Consistency: The “one version of truth” approach is violated. The downloads represent snapshots of the data taken at arbitrary times, so reports based on different local solutions can differ significantly and cause bad decisions and/or endless discussions on which dataset is the “right” one.
    Cost: The total cost of ownership for the company is higher. These local solutions also require technically skilled people to manage them, and the departments start to build up “shadow” IT organizations instead of relying on the central services of the IT organization.