In response to the widespread use of the XML format for document representation and message exchange, major database vendors support XML in terms of persistence, querying and indexing. With the new option of storing and querying XML in a relational DBMS, data architects face the decision of what portion of their data to persist as XML and what portion as relational data. XML documents and messages pervade enterprise systems to such an extent that XML formats have been standardized for data storage and exchange in many industries. While much critical data are still in relational format, practitioners have increasingly turned to XML for storing data that do not fit into the relational model.
In the health care industry, for example, XML is widely used for sharing the metadata of medical records in backend repositories. In one real-world scenario, the schema for the metadata contains over 200 variations in order to support the diverse types of medical documents being persisted and queried. These 200 variant types have a shared common section and specific individual extensions. Persisting such metadata in relational format results in a large number of tables and poor performance. Moreover, adding a new type of document requires many hours of re-engineering the relational schema to accommodate the new type.
In another example, XML is heavily employed in trading systems for representing financial products such as options and derivatives. New types of derivatives are invented every week, which in turn triggers weekly data model changes and hence schema changes. Again, using a relational format requires lengthy database schema changes and data migration, which consequently affect the agility of the business. As these two real-world cases illustrate, enterprises in various industries have turned to XML for greater flexibility and easier maintenance than a relational representation offers. On the other hand, much legacy data and transactional data remain highly rigid and well suited to the relational model. It is clear that neither pure relational nor pure XML data management systems will suffice.
As can be seen, although XML is now a first-class citizen in the DBMS, data architects are still unsure of how exactly to persist their data. It is still not trivial to decide what portion of the enterprise data to persist as XML and what portion as relational data. This problem has not been addressed yet and represents a serious need in the industry.
Therefore, a need exists to overcome the problems with the prior art as discussed above.