To exchange, migrate and integrate XML data, documents of a source XML (DTD) schema must be mapped to documents of a target schema. XML mappings can be defined, for example, in a query language, such as XQuery or XSLT, but such queries may be large and complex. In practice, it is therefore desirable that XML mappings (1) guarantee type safety and (2) preserve information.
The document produced by an XML mapping should conform to a target schema, guaranteeing type safety. This may be difficult to verify, however, for mappings defined in XQuery or XSLT. See, e.g., N. Alon et al., “XML with Data Values: Typechecking Revisited,” Principles of Database Systems (PODS) (2001). Further, since in many applications one does not want to lose the original information of the source data, a mapping should also preserve information. Criteria for information preservation include: (1) invertibility (i.e., whether the source document can be recovered from the target document); and (2) query preservation (i.e., for a particular XML query language, whether all queries on source documents in that language can be answered on target documents).
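By way of illustration only, the two information-preservation criteria can be sketched in Python for a deliberately trivial mapping that merely renames element tags. The tag names, the `RENAME` table and the helper functions `rename_tree`, `canonical` and `translate_path` are hypothetical and are assumed solely for this example; the query language is assumed to be restricted to simple child-axis paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag renaming from a source schema to a target schema.
RENAME = {"db": "school", "class": "course", "cno": "id", "title": "name"}
INVERSE = {new: old for old, new in RENAME.items()}


def rename_tree(node, table):
    """Return a copy of `node` with every tag renamed through `table`."""
    copy = ET.Element(table[node.tag])
    copy.text = node.text
    for child in node:
        copy.append(rename_tree(child, table))
    return copy


def canonical(node):
    """Nested-tuple form of a tree, used to compare two documents."""
    return (node.tag, (node.text or "").strip(),
            tuple(canonical(child) for child in node))


def translate_path(path, table):
    """Rewrite a simple child-axis path (e.g. 'class/title') step by step."""
    return "/".join(table[step] for step in path.split("/"))


source = ET.fromstring(
    "<db>"
    "<class><cno>CS1</cno><title>Intro</title></class>"
    "<class><cno>CS2</cno><title>Logic</title></class>"
    "</db>"
)
target = rename_tree(source, RENAME)

# Invertibility: the source document is recovered from the target document.
recovered = rename_tree(target, INVERSE)
invertible = canonical(recovered) == canonical(source)

# Query preservation: the translated query, evaluated on the target
# document, returns the same answer as the original query on the source.
query = "class/title"
source_answer = [e.text for e in source.findall(query)]
target_answer = [e.text for e in target.findall(translate_path(query, RENAME))]
query_preserved = source_answer == target_answer
```

Both checks succeed in this sketch because the renaming is a one-to-one correspondence between tags; a mapping that discarded or merged elements would generally fail the invertibility check.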
While a number of techniques have been proposed for information preservation for traditional database transformations, a need still exists for methods and apparatus for mapping XML documents. A number of tools and models have been proposed for finding XML mappings at the schema level or the instance level, but such tools and models have not addressed invertibility and query preservation for XML. A need therefore exists for methods and apparatus for mapping source documents to target documents that ensure type safety or preservation of information (or both). A further need exists for efficient methods and apparatus for finding information-preserving XML mappings.