The present invention relates to data processing by digital computer, and more particularly to mapping elements between disparate schemas.
Integration of applications in an enterprise can lead to more efficient operations. Enterprise application integration can require significant effort when migrating from disparate legacy applications to a more integrated framework. Enterprise application integration can be performed using a message exchange procedure, in which messages are exchanged between different data sets. Application data is typically organized according to the type of application or applications with which the data is designed to operate. As a result, the organization or structure of the data can be highly specialized. The messages used for enterprise application integration are generally structured sets of data in a well-defined syntax. The structure of the data can be referred to as its schema. Countless different schemas and/or schema domains exist (e.g., SQL DDL, XML-based dialects such as xCBL, OWL, RDF, ODMG, SAP-IDoc, EDI, and UBL). Many different integration scenarios (e.g., business process integration, enterprise application integration, and master data management) require schema matching, in which a mapping between the elements of two schemas is produced. Schema matching can also be important in data translation applications (e.g., where data from a first database is migrated into a second database for use with a different application).
Existing techniques for schema matching primarily rely upon manual mapping of elements from one schema to another. Some approaches, however, partially automate the schema matching process using simple algorithms that match field names or database structures, or using machine learning technologies. Other approaches combine the criteria of different matching algorithms to produce a more complex matching technique (i.e., hybrid and composite matchers). Simple, hybrid, and composite matchers, however, are inflexible: they tend to produce good results for some types of schemas and poor results for other types of schemas.
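By way of illustration only, the combination of simple matchers into a composite matcher can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular existing matcher: the function names, the representation of a schema element as a (field name, data type) pair, and the unweighted averaging rule are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher

# Assumption: a schema element is a (field_name, data_type) pair.

def name_similarity(a: str, b: str) -> float:
    """Simple matcher: case-insensitive string similarity of two field names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def type_similarity(a: str, b: str) -> float:
    """Simple matcher: 1.0 if the declared data types are equal, else 0.0."""
    return 1.0 if a == b else 0.0

def composite_similarity(src, tgt) -> float:
    """Composite matcher: unweighted mean of the two simple matchers (hypothetical rule)."""
    return (name_similarity(src[0], tgt[0]) + type_similarity(src[1], tgt[1])) / 2

def match(source, target):
    """Map each source field name to the best-scoring target field name."""
    return {
        s[0]: max(target, key=lambda t: composite_similarity(s, t))[0]
        for s in source
    }

# Example schemas (invented for illustration).
source = [("cust_name", "string"), ("order_total", "decimal")]
target = [("CustomerName", "string"), ("TotalAmount", "decimal"), ("OrderDate", "date")]
mapping = match(source, target)
```

Because the combination rule is fixed, such a matcher treats every pair of schemas the same way, which is the inflexibility noted above.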
Techniques have also been proposed for building ontologies for different schema domains. By building an ontology, schemas can be classified by type, and different weights can be applied to different individual matchers based on the class or classes of the schemas to be matched. For example, schemas in a first classification may use a composite matcher that heavily weights the contribution of its field name matcher component, while schemas in a second classification may use a composite matcher that heavily weights the contribution of its structural matcher component. Such an approach may provide improved performance relative to conventional simple, hybrid, or composite matchers, but it works only for schema domains that have previously been associated with a particular class of schema domains.
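The class-dependent weighting described above can be illustrated with a short sketch. The class names, the weight values, the slash-separated element paths, and the depth-based structural matcher are all assumptions made for the example; they are not taken from any particular ontology-based technique.

```python
from difflib import SequenceMatcher

# Hypothetical per-class weights: (field name matcher weight, structural matcher weight).
CLASS_WEIGHTS = {
    "relational": (0.8, 0.2),    # assumed class that favors field names
    "hierarchical": (0.3, 0.7),  # assumed class that favors structure
}

def name_score(a: str, b: str) -> float:
    """Field name matcher: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def structure_score(path_a: str, path_b: str) -> float:
    """Toy structural matcher: 1.0 at equal nesting depth, lower as depths diverge."""
    depth_a, depth_b = path_a.count("/"), path_b.count("/")
    return 1.0 / (1.0 + abs(depth_a - depth_b))

def weighted_score(schema_class: str, path_a: str, path_b: str) -> float:
    """Composite score whose component weights depend on the schema class."""
    if schema_class not in CLASS_WEIGHTS:
        # No weights are available for a schema domain that was never classified.
        raise KeyError(f"no weights defined for schema class {schema_class!r}")
    w_name, w_struct = CLASS_WEIGHTS[schema_class]
    name_a = path_a.rsplit("/", 1)[-1]  # last path segment is the field name
    name_b = path_b.rsplit("/", 1)[-1]
    return w_name * name_score(name_a, name_b) + w_struct * structure_score(path_a, path_b)

# Same element pair, scored under two different classes.
relational = weighted_score("relational", "order/total", "invoice/lines/item/total")
hierarchical = weighted_score("hierarchical", "order/total", "invoice/lines/item/total")
```

In this sketch the two elements share a name but sit at different depths, so the name-weighted class scores the pair higher than the structure-weighted class. A schema class with no entry in the weight table cannot be scored at all, which mirrors the limitation noted above.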