(Large) enterprises manage a rapidly increasing volume of information for optimizing production and distribution processes, for evaluating compliance and customer satisfaction, or for managing staff-related data. Often, many different systems are used for managing data, and the data is distributed over many different sources. In many cases, however, a “global”, “consolidated” or “holistic” view on the available data is necessary.
According to some prior art approaches, the data of many different data sources is copied and stored in a single (virtual/logical) data warehouse for easy, centralized access. Typically, Extract, Transform, Load (ETL) tools are used for extracting the data from the many data sources, transforming the extracted data into a common data format, and loading the formatted data into a central database management system. Building a data warehouse thus typically requires the definition of a common data model and is a complex endeavor. Moreover, using a data warehouse for providing central, consolidated data access has many drawbacks: the data needs to be replicated to the data warehouse. Typically, the data is replicated over a network, e.g. the internet or an intranet. Thus, any changes in the data sources are replicated to the data warehouse with some delay. This delay may result in inconsistencies, and the acquisition of advanced (and often expensive) data warehouse management technologies may be necessary. Moreover, the data transfer from the source systems to the data warehouse generates a significant amount of network traffic and consumes processing power.
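For illustration only, the extract, transform and load steps described above can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of any particular ETL tool: the two in-memory SQLite sources, their table and column names (customers, clients, customer_fact) and the helper functions are all hypothetical.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create a hypothetical in-memory source database with some rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Two assumed sources that store similar data in different formats.
crm = make_source(
    "CREATE TABLE customers (id INTEGER, full_name TEXT, revenue_eur REAL)",
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada Lovelace", 120.0), (2, "Alan Turing", 80.5)],
)
erp = make_source(
    "CREATE TABLE clients (client_no TEXT, name TEXT, revenue_cents INTEGER)",
    "INSERT INTO clients VALUES (?, ?, ?)",
    [("C-3", "Grace Hopper", 9900)],
)


def transform_crm(row):
    """Map a CRM row to the common format (key, name, revenue in cents)."""
    cid, name, eur = row
    return (f"crm:{cid}", name, round(eur * 100))


def transform_erp(row):
    """Map an ERP row to the common format (key, name, revenue in cents)."""
    client_no, name, cents = row
    return (f"erp:{client_no}", name, cents)


def run_etl():
    """Extract from both sources, transform, and load into a new warehouse."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE customer_fact "
        "(source_key TEXT PRIMARY KEY, name TEXT, revenue_cents INTEGER)"
    )
    # Extract
    crm_rows = crm.execute(
        "SELECT id, full_name, revenue_eur FROM customers").fetchall()
    erp_rows = erp.execute(
        "SELECT client_no, name, revenue_cents FROM clients").fetchall()
    # Transform into the common data model
    rows = [transform_crm(r) for r in crm_rows]
    rows += [transform_erp(r) for r in erp_rows]
    # Load
    warehouse.executemany("INSERT INTO customer_fact VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return warehouse
```

In this sketch, a row added to one of the sources after run_etl() has returned is not visible in the warehouse until the next ETL run, which corresponds to the replication delay described above.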
Alternatively, federated database systems are used for providing a single, consolidated view of the available data. A federated database system is a type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized. A federated database, or virtual database, is a composite of all constituent databases in a federated database system. In contrast to the data warehouse approach, the data of the data sources is not copied into a central repository. Thus, there is no actual data integration in the constituent disparate databases as a result of data federation. Rather, federated database systems provide a uniform user interface through mapping and abstraction of data structures, thereby enabling users and clients to store and retrieve data from multiple noncontiguous databases with a single query. To this end, a federated database system must be able to decompose the query into sub-queries for submission to the relevant constituent DBMSs, after which the system must combine the result sets of the sub-queries. Because various database management systems employ different query languages, federated database systems require wrappers to translate the sub-queries into the appropriate query languages. Federated database systems are very sensitive to structural changes in the source databases and thus are often considered inflexible and costly to maintain. For example, in case the structure of some data tables in a source database is changed or in case a DBMS with a different SQL dialect is used as a new data source, the mapping that generates the abstraction layer also needs to be changed. Moreover, the organization of the data in some or all of the source databases may not be suited for efficient query execution if the query is an analytical query covering many different data sources.
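For illustration only, the decomposition of a single query into sub-queries, their translation by wrappers, and the combination of the result sets can be sketched as follows. This is a minimal sketch under stated assumptions, not the design of any particular federated DBMS: the Wrapper class, the federated_query function, the two in-memory SQLite sources and their schemas are hypothetical, and the differing table layouts stand in for differing query languages.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable


@dataclass
class Wrapper:
    """Hypothetical wrapper: maps a logical sub-query onto one source."""
    conn: sqlite3.Connection
    sub_query: str                            # SQL in the source's own schema
    to_source_param: Callable[[int], object]  # cents -> source's unit
    to_common_row: Callable[[tuple], tuple]   # source row -> (name, cents)

    def fetch(self, min_revenue_cents):
        param = self.to_source_param(min_revenue_cents)
        cursor = self.conn.execute(self.sub_query, (param,))
        return [self.to_common_row(row) for row in cursor]


def federated_query(wrappers, min_revenue_cents):
    """Send one sub-query per source and combine the result sets."""
    combined = []
    for wrapper in wrappers:
        combined.extend(wrapper.fetch(min_revenue_cents))
    # Highest revenue first, as a stand-in for the global query's ordering.
    return sorted(combined, key=lambda row: row[1], reverse=True)


# Source A stores revenue in euros, source B in cents (assumed schemas).
source_a = sqlite3.connect(":memory:")
source_a.execute("CREATE TABLE customers (full_name TEXT, revenue_eur REAL)")
source_a.executemany("INSERT INTO customers VALUES (?, ?)",
                     [("Ada Lovelace", 120.0), ("Alan Turing", 80.5)])

source_b = sqlite3.connect(":memory:")
source_b.execute("CREATE TABLE clients (name TEXT, revenue_cents INTEGER)")
source_b.executemany("INSERT INTO clients VALUES (?, ?)",
                     [("Grace Hopper", 9900), ("Edsger Dijkstra", 5000)])

wrappers = [
    Wrapper(source_a,
            "SELECT full_name, revenue_eur FROM customers "
            "WHERE revenue_eur >= ?",
            lambda cents: cents / 100,
            lambda row: (row[0], round(row[1] * 100))),
    Wrapper(source_b,
            "SELECT name, revenue_cents FROM clients "
            "WHERE revenue_cents >= ?",
            lambda cents: cents,
            lambda row: (row[0], row[1])),
]

result = federated_query(wrappers, 9000)
# result == [("Ada Lovelace", 12000), ("Grace Hopper", 9900)]
```

In this sketch no data is copied into a central repository, so each call reads the current state of the sources. The sub-query strings and conversion functions play the role of the mapping: renaming a column in either source would require the corresponding wrapper to be changed, which corresponds to the maintenance cost described above.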