The present invention relates in general to database applications and, more particularly, to consolidation of multiple source content schemas into a single target content schema.
In electronic commerce ("e-commerce") database applications, content such as product-related information may be organized according to schemas. Generally, a schema is an organizational structure for data. In a relational database, a schema may define the tables residing in the database, the rows and fields in each table, and the relationships between the rows and fields in different tables. In e-commerce database applications, a schema may include a set of product classes (which can be referred to as a "taxonomy") organized in a hierarchy, with each class being associated with a set of product features, characteristics, or other product attributes (which can be referred to as a "product ontology"). For example, writing pens can have different kinds of tips (e.g., ball point or felt tip), different tip sizes (e.g., fine, medium or broad), and different ink colors (e.g., blue, black or red). Accordingly, a schema can include a class corresponding to pens, which has a product ontology including tip type, tip size and color, or other appropriate attributes. Within a class, products may be defined by product attribute values (e.g., ball point, medium tip, blue ink). Product attribute values can include numbers, letters, figures, characters, symbols, or other suitable information for describing a product.
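By way of illustration only, the pen example above might be represented as in the following minimal sketch. The names `ProductClass`, `path`, and `conforms`, the parent class "Writing Instruments", and the attribute identifiers are hypothetical and are chosen only to show a taxonomy of classes, each carrying a product ontology, with a product defined by attribute values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProductClass:
    """One class in the taxonomy, together with its product ontology."""
    name: str
    attributes: List[str] = field(default_factory=list)
    parent: Optional["ProductClass"] = None

    def path(self) -> str:
        """Position of this class in the hierarchy, root first."""
        prefix = self.parent.path() + "/" if self.parent else ""
        return prefix + self.name


def conforms(product: Dict[str, str], product_class: ProductClass) -> bool:
    """True if every attribute of the product belongs to the class ontology."""
    return set(product) <= set(product_class.attributes)


# Hypothetical two-level taxonomy (an assumption for illustration).
writing = ProductClass("Writing Instruments")
pens = ProductClass("Pens", ["tip_type", "tip_size", "ink_color"], parent=writing)

# A product is defined by attribute values within its class.
blue_pen = {"tip_type": "ball point", "tip_size": "medium", "ink_color": "blue"}
```

Here `conforms(blue_pen, pens)` checks only that the product uses attributes from the ontology of its class; validation of the attribute values themselves is omitted.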
Existing database techniques for organizing and normalizing content (e.g., product-related information) use different schemas for different industry verticals, commodity domains, or other product classification structures. Consequently, if a user (e.g., customer or supplier) requires or uses content that can be classified in more than one product classification structure, the different schemas required for the content are maintained in separate databases. Although the resources expended in the past for maintaining different schemas in different databases may have been relatively low, the resources required for maintaining such separate databases in evolving e-commerce applications continue to increase as the number of e-commerce transactions conducted and the number of product searches performed continue to grow.
According to the present invention, disadvantages and problems associated with previous database applications may be reduced or eliminated.
In one aspect of the present invention, a computer-implemented method for mapping multiple source content schemas into a single target content schema includes receiving user input specifying one or more source schemas for mapping to a target schema. Each source schema is associated with a corresponding source database that stores content according to the source schema, each source schema having one or more source classes that each have one or more source properties. The target schema is associated with a corresponding target database that can receive content extracted from one or more source databases and store the extracted content according to the target schema, the target schema having one or more target classes that each have one or more target properties. The method further includes, for each source schema to be mapped to the target schema, providing a source class tree representing the source schema classes and a target class tree representing the target schema classes for display to a user, receiving user input specifying one or more source properties within one or more source classes of the source schema, determining one or more target properties within one or more target classes of the target schema for mapping to the one or more source properties of the source schema, and generating a schema map file comprising a mapping of the one or more target properties of the target schema classes to the one or more source properties of the source schema classes. The method further includes accessing the schema map file for each source schema mapped to the target schema and, according to the one or more accessed schema map files, determining the one or more mapped source properties for which content is to be extracted from the one or more corresponding source databases.
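As a non-limiting sketch of the schema map described above, the following example builds a mapping of target properties to source properties for one source schema, serializes it as the content of a map file, and then reads it back to determine the mapped source properties. The JSON layout, the function names `build_schema_map` and `mapped_source_properties`, and the schema, class, and property names are assumptions made for illustration; no particular file format is specified here.

```python
import json
from typing import Dict, List, Tuple

# A property is identified here by (class name, property name).
Prop = Tuple[str, str]


def build_schema_map(source_schema: str, target_schema: str,
                     pairs: List[Tuple[Prop, Prop]]) -> Dict:
    """Map target properties to source properties for one source schema.

    Each pair is (source property, target property), as might be chosen
    by a user from the displayed source and target class trees.
    """
    return {
        "source_schema": source_schema,
        "target_schema": target_schema,
        "mappings": [
            {"target_class": tgt[0], "target_property": tgt[1],
             "source_class": src[0], "source_property": src[1]}
            for src, tgt in pairs
        ],
    }


def mapped_source_properties(schema_map: Dict) -> List[Prop]:
    """Source properties for which content is to be extracted."""
    return [(m["source_class"], m["source_property"])
            for m in schema_map["mappings"]]


# Hypothetical user selections for a source schema named "office_supplies".
schema_map = build_schema_map("office_supplies", "consolidated", [
    (("Pens", "tip"), ("Writing Instruments", "tip_type")),
    (("Pens", "colour"), ("Writing Instruments", "ink_color")),
])

# Text that would be written to the schema map file (hypothetical format).
schema_map_text = json.dumps(schema_map, indent=2)
```

Source properties that do not appear in the map (for example, a property the user did not select) are simply never extracted under this sketch.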
The method further includes applying the one or more schema map files to generate information used for extracting the content associated with the one or more mapped source properties and information used for loading the extracted content into the target database, extracting the content associated with the one or more mapped source properties from the one or more corresponding source databases, and loading the extracted content associated with the one or more mapped source properties into the target database according to the information for loading the extracted content.
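The applying, extracting, and loading steps might be sketched as follows, under the assumptions that each class is stored as a relational table, each property as a column, and both databases are SQLite databases. The mapping entries, table and column names, and the functions `plan` and `consolidate` are hypothetical and serve only to show how a schema map could yield both the extraction information and the loading information.

```python
import sqlite3

# Hypothetical schema map entries (assumed: class = table, property = column).
MAPPINGS = [
    {"source_class": "pens", "source_property": "tip",
     "target_class": "writing_instruments", "target_property": "tip_type"},
    {"source_class": "pens", "source_property": "colour",
     "target_class": "writing_instruments", "target_property": "ink_color"},
]


def plan(mappings):
    """Derive (extract SQL, load SQL) pairs from the schema map entries."""
    groups = {}
    for m in mappings:
        key = (m["source_class"], m["target_class"])
        groups.setdefault(key, []).append(
            (m["source_property"], m["target_property"]))
    plans = []
    for (src, tgt), cols in groups.items():
        src_cols = ", ".join(s for s, _ in cols)
        tgt_cols = ", ".join(t for _, t in cols)
        marks = ", ".join("?" for _ in cols)
        # Identifiers are interpolated directly, so the map must be trusted.
        plans.append((f"SELECT {src_cols} FROM {src}",
                      f"INSERT INTO {tgt} ({tgt_cols}) VALUES ({marks})"))
    return plans


def consolidate(source_db, target_db, mappings):
    """Extract mapped content from the source and load it into the target."""
    for select_sql, insert_sql in plan(mappings):
        rows = source_db.execute(select_sql).fetchall()
        target_db.executemany(insert_sql, rows)
    target_db.commit()


# In-memory databases standing in for a source and a target database.
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE pens (tip TEXT, colour TEXT, sku TEXT)")
source_db.execute("INSERT INTO pens VALUES ('ball point', 'blue', 'P-1')")

target_db = sqlite3.connect(":memory:")
target_db.execute(
    "CREATE TABLE writing_instruments (tip_type TEXT, ink_color TEXT)")

consolidate(source_db, target_db, MAPPINGS)
```

With several source schemas, `consolidate` would be called once per source database with that schema's own map, so that all extracted content accumulates in the single target database. The unmapped `sku` column is left behind in the source.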
Particular embodiments of the present invention may provide one or more technical advantages. For example, particular embodiments of the invention may enable a user to readily consolidate multiple source schemas associated with multiple corresponding source databases into a single target schema for a single corresponding target database. As such, particular embodiments of the invention may allow any enterprise to create one or more customized schemas from multiple domain-specific schemas, for example, schemas specific to particular industry verticals, commodity domains, or other product classification structures. In addition, as a result of the consolidation of schemas for multiple source databases into a schema for a single target database, resources previously required for maintenance of multiple databases may be substantially reduced. Particular embodiments of the invention may provide some, all, or none of these advantages. One or more other technical advantages may be readily apparent to one skilled in the art from the figures, description, and claims included herein.