Dealing with data distributed over multiple databases has long been a problem. As accessing data from multiple databases has become more of a necessity, people have increasingly turned to data warehousing. Data warehousing provides a method by which the data from various databases can be replicated so that it can be treated as belonging to a single database. Further, given sufficient resources, the replicated structure is not required to be a relational database, but can instead be implemented as any kind of structure desired.
Data warehousing provided a simpler and faster method for extracting results. As new data warehouse techniques and software became available, warehouses were designed to best fit the types of data requests that were expected. By abandoning the constraints of the relational database, far more efficient designs could be implemented. However, data warehousing is a very costly endeavor, and each reorganization of the data requires complete replication, which limits how flexible the data warehouse can be. For many potential users, the cost of maintaining something the size of a data warehouse is prohibitive. Data warehouses were, and still are, a major undertaking.
In recent years a new technology, Virtual Data Warehousing, has attempted to allow multiple databases to be interrogated together without the need to construct and maintain a massive data warehouse. Instead of replicating all the databases within a single massive database, a virtual data warehouse provides common access to all the databases, making the collection of independent databases appear to the user as a data warehouse. This greatly reduces infrastructure and maintenance costs, and the data does not have to be replicated. However, virtual data warehousing suffers from very poor response times and from major drains on the overall system that cannot be adequately anticipated, due to unscheduled massive processing within the member databases.
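As a rough illustration of the virtual approach, the following Python sketch places a thin facade over two independent in-memory SQLite databases. The class name `VirtualWarehouse`, the helper `make_member`, the `orders` table, and the sample rows are all hypothetical and chosen only for illustration. It is a minimal sketch, not a description of any particular product. Because nothing is replicated, every query is forwarded to every member database at request time, which is one way the unscheduled load on member databases can arise.

```python
import sqlite3


def make_member(rows):
    """Create an independent in-memory member database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


class VirtualWarehouse:
    """Hypothetical facade: no data is replicated, queries are fanned out."""

    def __init__(self, members):
        self.members = members

    def query(self, sql, params=()):
        # Every call hits every member database, so each member pays
        # the processing cost at query time.
        results = []
        for conn in self.members:
            results.extend(conn.execute(sql, params).fetchall())
        return results


east = make_member([("acme", 100.0), ("globex", 250.0)])
west = make_member([("acme", 75.0)])
warehouse = VirtualWarehouse([east, west])

# The user sees one logical source; the rows come from both members.
acme_orders = warehouse.query(
    "SELECT customer, amount FROM orders WHERE customer = ?", ("acme",)
)
acme_total = sum(amount for _, amount in acme_orders)
```

In this sketch the user issues a single query, yet both member databases do work for it. A conventional warehouse would instead answer from its own replicated copy.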
An additional problem arises when many users attempt to access data from many sources. In this scenario it is difficult to properly restrict access to certain data while granting access to other data. Conventional data warehouses require very specific software to be written to address this problem; out of the box, there is no simple solution. Conventional virtual data warehouses rely on the security models of the member databases. However, these models break down as soon as any performance upgrade, such as caching or indexing of data, is performed external to the member databases.
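The breakdown of member-database security under external caching can be sketched as follows. The classes `MemberDatabase` and `NaiveCache`, the user names, and the sample record are hypothetical, and the access rule is deliberately simplistic. The sketch assumes a cache keyed only by the requested data, which is one plausible way such a layer might be built, not a claim about any specific system.

```python
class MemberDatabase:
    """Hypothetical member database that enforces its own access rules."""

    def __init__(self, rows, allowed_users):
        self._rows = rows
        self._allowed = set(allowed_users)

    def fetch(self, user, key):
        if user not in self._allowed:
            raise PermissionError(f"{user} may not read {key}")
        return self._rows[key]


class NaiveCache:
    """External cache keyed by data key only; it knows nothing about users."""

    def __init__(self, member):
        self._member = member
        self._store = {}

    def fetch(self, user, key):
        if key in self._store:
            # Cache hit: the member database's access check is never run.
            return self._store[key]
        value = self._member.fetch(user, key)
        self._store[key] = value
        return value


payroll = MemberDatabase({"salary:alice": 90000}, allowed_users=["hr"])
cache = NaiveCache(payroll)

# Direct access by an unauthorized user is refused by the member database.
try:
    payroll.fetch("intern", "salary:alice")
    direct_access_blocked = False
except PermissionError:
    direct_access_blocked = True

first = cache.fetch("hr", "salary:alice")       # authorized, fills the cache
leaked = cache.fetch("intern", "salary:alice")  # unauthorized, served from cache
```

Once the authorized request has populated the cache, the unauthorized request is answered without the member database ever being consulted, so the member's security model no longer applies.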