Multiple data models, and the databases built on them, allow business processes to be automated through both custom-built applications and applications built on commercial off-the-shelf software packages. Each data model rests upon its own domain of attributes, defined by data schemas. Often, the same business entities exist concurrently in several data schemas, which combine database schemas for relational databases with schemas for non-relational databases, such as Cubes, Reports, Dashboards, and Scorecards. Often, the attributes defined by these data schemas differ in name, data type, and constraint type. This leads to a multiplicity of definitions of business entities, which creates problems in data integration endeavors, particularly in those directly concerned with information access and analysis.
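As an illustration of the multiplicity described above, the following minimal sketch compares two definitions of the same business entity. The schema contents, the attribute names, and the `describe_mismatches` helper are hypothetical and are introduced only for this example; they are not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One attribute of an entity as a single data schema defines it."""
    name: str
    data_type: str
    nullable: bool


# Hypothetical example: the same Customer entity in two application schemas.
crm_customer = {
    "cust_id": Attribute("cust_id", "INTEGER", False),
    "cust_name": Attribute("cust_name", "VARCHAR(80)", False),
}
billing_customer = {
    "CustomerNumber": Attribute("CustomerNumber", "CHAR(10)", False),
    "FullName": Attribute("FullName", "NVARCHAR(120)", True),
}

# Hypothetical hand-maintained pairing of attributes that mean the same thing.
same_attribute = [("cust_id", "CustomerNumber"), ("cust_name", "FullName")]


def describe_mismatches(left, right, pairs):
    """For each paired attribute, list the ways the two definitions differ."""
    result = []
    for left_name, right_name in pairs:
        a, b = left[left_name], right[right_name]
        kinds = []
        if a.name != b.name:
            kinds.append("name")
        if a.data_type != b.data_type:
            kinds.append("data_type")
        if a.nullable != b.nullable:
            kinds.append("constraint")
        result.append((left_name, right_name, kinds))
    return result


mismatches = describe_mismatches(crm_customer, billing_customer, same_attribute)
```

Even in this two-attribute case, every pairing differs in name and data type, and one also differs in its nullability constraint, so an integration effort must reconcile each pairing individually.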
Two known approaches to the problem of data integration are the federated database approach and the data warehousing approach. The federated database approach brings attributes from different data schemas together within a single context or catalog. However, this approach has two drawbacks. First, although the federated database approach accomplishes the structural integration of data, it fails in the functional integration of data: entities are cataloged individually, and the conceptual entities to which they belong are not reconstructed. For example, assume that a business entity named Orders refers to a family of entities, where an Order has many Items and an Item has many Ship-to destinations. As a consequence of data decomposition, this family would appear as at least three separate entities under the federated database approach, and the entity Orders would not be reconstructed as a single conceptual entity with the child entities Item and Ship-to. Second, the federated approach does not deal with the metadata of non-relational data schemas.
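The difference between a flat catalog and a reconstructed conceptual entity can be sketched as follows. This is an illustrative sketch only: the `child_of` mapping and the `build_family` function are hypothetical names, and the parent-child relationships are assumed to be known from some source outside the federated catalog.

```python
# What a federated catalog exposes: three independent entries.
federated_catalog = ["Orders", "Item", "Ship-to"]

# Hypothetical parent-child relationships that the flat catalog does not record.
child_of = {"Item": "Orders", "Ship-to": "Item"}


def build_family(root, child_of):
    """Nest each entity under its parent to form one conceptual entity."""
    children = [child for child, parent in child_of.items() if parent == root]
    return {root: [build_family(child, child_of) for child in children]}


orders_family = build_family("Orders", child_of)
# orders_family == {"Orders": [{"Item": [{"Ship-to": []}]}]}
```

The flat list carries no information from which `orders_family` could be derived; the nesting becomes available only once the relationships in `child_of` are supplied separately.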
The data warehousing approach makes a copy of related entities/tables and transforms them into a single entity/table. For example, the entities/tables Customers and Customer Types are placed within a single Customer dimension table by means of denormalization. However, such a transformation cannot be accomplished with transaction tables such as Orders and Payments.
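The denormalization step mentioned above can be sketched as follows. The column names, the sample rows, and the `build_customer_dimension` function are assumptions made for illustration, and the sketch uses in-memory rows rather than a real warehouse load.

```python
# Hypothetical source rows for the two related tables.
customers = [
    {"customer_id": 1, "name": "Acme", "type_id": 10},
    {"customer_id": 2, "name": "Globex", "type_id": 20},
]
customer_types = [
    {"type_id": 10, "type_name": "Retail"},
    {"type_id": 20, "type_name": "Wholesale"},
]


def build_customer_dimension(customers, customer_types):
    """Copy both tables into one denormalized Customer dimension."""
    type_names = {t["type_id"]: t["type_name"] for t in customer_types}
    return [
        {
            "customer_id": c["customer_id"],
            "name": c["name"],
            # The type name is copied into each customer row, replacing the key.
            "customer_type": type_names[c["type_id"]],
        }
        for c in customers
    ]


customer_dimension = build_customer_dimension(customers, customer_types)
```

Each row of the resulting dimension table carries its customer type directly, so a query against it no longer needs a join to the Customer Types table.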
Furthermore, the known approaches require that users have a perfect knowledge of the underlying database structures in order to access the data. This requirement is impractical, because business users cannot be expected to learn the intricacies of the databases. Thus, data integration projects require architects to acquire perfect knowledge of the databases involved, which is a costly, time-consuming, and impractical process.