1. Field of the Invention
The present invention relates generally to systems and methods for data integration and management, and more particularly to integrating and accessing multiple data sources within a data warehouse architecture through techniques such as the automatic generation of mediators that accept data in a specific format, transform the data, and store it.
2. Discussion of Background Art
Data warehousing is an approach for managing data from multiple sources by presenting a single, consistent view of it. One of the more typical data warehouse architectures, the mediated data warehouse, uses a series of data-source-specific wrapper and mediator layers to integrate the data into the consistent format required by the warehouse. Commercial data warehousing products have been produced by companies such as Red Brick, IBM, Brio, Andyne, Ardent, NCR, Information Advantage, Informatica, and others. Furthermore, some companies use relational databases, such as those sold by Oracle, IBM, Informix and Sybase, to develop their own in-house data warehousing solutions.
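As an illustration only, the wrapper and mediator layers of a mediated data warehouse can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any of the products named above: the two data sources, their field names, the WarehouseRecord schema, and the functions wrap_source_a, wrap_source_b, and mediate are all hypothetical.

```python
from dataclasses import dataclass


# Hypothetical warehouse schema: every source must end up in this shape.
@dataclass(frozen=True)
class WarehouseRecord:
    record_id: str
    name: str
    length: int


def wrap_source_a(line: str) -> dict:
    """Wrapper for hypothetical source A, which emits 'id|name|length' text."""
    record_id, name, length = line.strip().split("|")
    return {"id": record_id, "name": name, "length": length}


def wrap_source_b(row: dict) -> dict:
    """Wrapper for hypothetical source B, which emits dicts with its own keys."""
    return {"id": row["accession"], "name": row["title"], "length": row["size"]}


def mediate(parsed: dict) -> WarehouseRecord:
    """Mediator: converts wrapper output into the warehouse's consistent format."""
    return WarehouseRecord(
        record_id=str(parsed["id"]),
        name=parsed["name"].strip(),
        length=int(parsed["length"]),
    )


# Each source passes through its own wrapper, then the shared mediator.
warehouse = [
    mediate(wrap_source_a("A1|alpha|120\n")),
    mediate(wrap_source_b({"accession": "B7", "title": " beta ", "size": 45})),
]
```

In this sketch, each wrapper is tied to the exact format its source emits. If hypothetical source A were to add a fourth field to each line, wrap_source_a would fail until it was rewritten by hand.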
These approaches are successful when applied to traditional business data because the data format used by the individual data sources tends to be rather static. Therefore, once a data source has been integrated into a data warehouse, relatively little work is required to maintain that connection. However, that is not the case for all data sources. Some data sources, particularly within certain domains, tend to change their data model, format and/or interface regularly. This is problematic because each change requires the warehouse administrator to update the wrapper, mediator, and warehouse to properly read, interpret, and represent the new format. Because these updates can be difficult and time-consuming, the regularity of data source format changes effectively limits the number of sources that can be integrated into a single data warehouse.
In order to increase the number of dynamic data sources that can be integrated into a warehouse, the cost of maintaining the warehouse must be decreased. This could be accomplished by some combination of reducing the cost to maintain the wrapper, the mediator, and the warehouse data store.
In response to the concerns discussed above, what is needed is a system and method for reducing the cost of data warehouses that integrate and provide access to multiple data sources, overcoming the problems of the prior art.