Vast amounts of information are stored in data sources according to different storage models (e.g., relational, object), which creates a continuing problem of translating data between those sources. This is a major problem for companies and individuals because data that exists in one model (or schema) often needs to be accessed through a different model (or schema). One example is data warehousing, where data is received from many different sources and stored for quick access by other sources. Converting data from one model (or schema) to another is not only time-consuming and resource-intensive, but can also be fraught with conversion problems.
Data mapping is a technique for creating a data transformation between disparate data sources (or schemas). More specifically, a data mapping is a relationship between the instances of two different schemas. As an example, consider three data schemas (schema one, schema two, and schema three), where a first data mapping maps data between schema one and schema two, and a second data mapping maps data between schema two and schema three. The goal is to compose the first and second data mappings into a third mapping that captures the same relationship between schema one and schema three as mapping one and mapping two taken together.
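By way of illustration only, the three-schema example can be sketched in Python under a simplifying assumption: each mapping is modeled as a finite set of (source instance, target instance) pairs. Practical mappings are normally expressed as queries or constraints rather than enumerated pairs, so this shows only what a composed mapping should mean, not how one would be computed. The names `compose`, `m12`, `m23`, and the toy instances are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

# Assumption: an instance is any hashable value, and a mapping is a
# finite set of (source instance, target instance) pairs.
Instance = Hashable
Mapping = Set[Tuple[Instance, Instance]]


def compose(m12: Mapping, m23: Mapping) -> Mapping:
    """Return every pair (a, c) for which some intermediate instance b
    satisfies (a, b) in m12 and (b, c) in m23."""
    # Index the second mapping by its schema-two instance.
    by_left: Dict[Instance, Set[Instance]] = {}
    for b, c in m23:
        by_left.setdefault(b, set()).add(c)
    # Join the first mapping against that index.
    return {(a, c) for a, b in m12 for c in by_left.get(b, ())}


# Toy instances (hypothetical): each instance is a tuple of rows.
person_ann = (("Ann", "Paris"),)   # schema one: (name, city)
person_bob = (("Bob", "Oslo"),)
name_ann = (("Ann",),)             # schema two: (name)
name_bob = (("Bob",),)
initial_a = (("A",),)              # schema three: (initial)
initial_b = (("B",),)

m12: Mapping = {(person_ann, name_ann), (person_bob, name_bob)}
m23: Mapping = {(name_ann, initial_a), (name_bob, initial_b)}

# The third mapping relates schema one directly to schema three.
m13 = compose(m12, m23)
```

In this sketch, `m13` pairs each schema-one instance with the schema-three instance that it reaches through schema two, so schema two no longer appears in the result.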
Mapping composition is at the core of many important data management problems and occurs in many practical settings, including data integration, database design, peer-to-peer data management, schema evolution (where a first schema evolves into an updated schema), and the merging of two different schemas into a single schema. Some common types of mappings include relational queries, relational view definitions, global-and-local-as-view (GLAV) assertions, XQuery queries, and XSL (Extensible Stylesheet Language) transformations. Hence, general-purpose algorithms for manipulating mappings have broad application to data management. Conventional technologies lack a general-purpose, reusable composition architecture suitable for many application settings, which makes data management between data models a continuing problem.