1. Field of the Invention
The present invention relates to systems and methods for mapping data in one or more data sources having source data schemas to at least one data target having a target data schema.
2. Description of the Related Art
Many modern applications such as data warehousing, global information systems, and electronic commerce require accessing a data source that stores data arranged in a source schema, and then using that data at a target which requires the data to be arranged in a target data schema. As but one example, product data that is stored in one schema for optimal storage efficiency might have to be accessed and reformatted into another schema for Web commerce, often in real time.
Thus, mappings between a source schema and a target schema are required. Creating such mappings is currently a largely manual and difficult process, accomplished using complex programs that are handwritten or pieced together with special tools and that must be carefully tuned to optimize performance. Particularly in the context of e-commerce, such a manual process is unacceptable, because e-commerce applications evolve very quickly and often require direct access to source data in real time. With this in mind, the present invention recognizes that it is desirable to facilitate creation of a source-to-target mapping by a user who might not be an expert in schema mapping.
The present invention critically observes that in the above-mentioned applications, particularly in e-commerce, it is not necessary to transform an entire source database into a target schema to satisfy a single request. Moreover, the present invention observes that these applications might require both data transformations and schema transformations. In conventional integration paradigms, the two are viewed as separate endeavors. One consequence is that conventional integration paradigms do not make use of data in evaluating schema correspondences. In the context of the above-mentioned applications, however, the present invention recognizes that data advantageously can be used to evaluate various schema mappings.
The present invention also recognizes that relational database management systems (RDBMS) are often used as data sources and as targets. With this recognition in mind, and in light of the observations above regarding the need to generate source-to-target mappings quickly and simply, the present invention understands that a DBMS can also be used to create mappings by using SQL queries, even when a user is not an expert in SQL. As set forth below, this is done by guiding the user over a space of potentially many competing join queries and then selecting only a subset of the join queries as the desired source-to-target mapping.
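By way of non-limiting illustration only, the notion of competing join queries can be sketched as follows, using Python's standard sqlite3 module. The table names (product, list_price, sale_price, catalog), the sample rows, and the two candidate queries are hypothetical and are not taken from the invention; the sketch merely shows how two join queries over the same source can each populate the same target relation, and how previewing the data each produces can help a non-expert user choose between them.

```python
import sqlite3

# Hypothetical source schema: two tables could each supply a product "price".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (pid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE list_price (pid INTEGER, amount REAL);
    CREATE TABLE sale_price (pid INTEGER, amount REAL);
    INSERT INTO product VALUES (1, 'lamp'), (2, 'desk');
    INSERT INTO list_price VALUES (1, 20.0), (2, 150.0);
    INSERT INTO sale_price VALUES (1, 15.0);
""")

# Competing join queries; each is a candidate mapping into a hypothetical
# target relation catalog(name, price).
candidates = {
    "list": "SELECT p.name, l.amount FROM product p "
            "JOIN list_price l ON p.pid = l.pid ORDER BY p.pid",
    "sale": "SELECT p.name, s.amount FROM product p "
            "JOIN sale_price s ON p.pid = s.pid ORDER BY p.pid",
}

# Run each candidate so the user can compare the data it would produce.
previews = {label: conn.execute(sql).fetchall()
            for label, sql in candidates.items()}
for label, rows in previews.items():
    print(label, rows)

# Assume the user, having seen the previews, selects the "list" join.
chosen = "list"
conn.execute("CREATE TABLE catalog (name TEXT, price REAL)")
conn.execute(f"INSERT INTO catalog {candidates[chosen]}")
catalog = conn.execute(
    "SELECT name, price FROM catalog ORDER BY rowid").fetchall()
```

In this sketch the "sale" candidate yields a row only for the product that has a sale price, whereas the "list" candidate covers both products; such differences in the resulting data are the kind of evidence a user might rely on when selecting among candidate joins.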