When Extensible Markup Language (XML) data is stored in a relational table, XML becomes a column type. Storing the XML data in a way that achieves scalability is a challenging problem. The conventional approach is to store the XML data within the base table as a VARCHAR (variable-length character) or CLOB (Character Large Object) column. The textual XML format stored in these column types works well for neither XML queries nor updates: textual XML must be parsed every time it is queried, which is an expensive operation, and in general an update can only be performed by replacing the whole document.
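The cost of the textual approach can be illustrated with a minimal sketch. It assumes an in-memory SQLite database in which a TEXT column stands in for a VARCHAR or CLOB column; the table `orders`, the sample documents, and the helper functions `orders_containing` and `set_quantity` are hypothetical names chosen for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumption: a TEXT column stands in for a VARCHAR/CLOB column of textual XML.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")
conn.executemany(
    "INSERT INTO orders (id, doc) VALUES (?, ?)",
    [
        (1, "<order><item sku='A1' qty='2'/><item sku='B7' qty='1'/></order>"),
        (2, "<order><item sku='C3' qty='5'/></order>"),
    ],
)


def orders_containing(sku):
    """Return ids of orders with an item of the given sku.

    Every stored document is parsed again on every call, because the
    database only sees opaque text.
    """
    hits = []
    for row_id, text in conn.execute("SELECT id, doc FROM orders ORDER BY id"):
        root = ET.fromstring(text)  # full parse per row, per query
        if any(item.get("sku") == sku for item in root.iter("item")):
            hits.append(row_id)
    return hits


def set_quantity(row_id, sku, qty):
    """Change one attribute by reading, parsing, and rewriting the whole document."""
    (text,) = conn.execute(
        "SELECT doc FROM orders WHERE id = ?", (row_id,)
    ).fetchone()
    root = ET.fromstring(text)
    for item in root.iter("item"):
        if item.get("sku") == sku:
            item.set("qty", str(qty))
    # Whole-document replacement: the entire serialized text is written back.
    conn.execute(
        "UPDATE orders SET doc = ? WHERE id = ?",
        (ET.tostring(root, encoding="unicode"), row_id),
    )
```

In this sketch, changing a single attribute value still requires reading, parsing, re-serializing, and rewriting the complete document, and every query repeats the parse for each row it examines.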
Another conventional storage format is based on the object-relational data model: the XML is decomposed into a relational or object model, and its nodes and edges are stored as rows or objects. Because of the large number of entries generated for documents and the joins needed for XML queries, this approach is not scalable. In addition, it may be difficult to support proper concurrency control over the decomposed data, such as locking a subtree.
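The decomposition approach can be sketched in the same way. The following is an illustrative example only, assuming a single hypothetical `node` table in which each element or attribute is one row and the `parent` column represents the edge to its parent; the schema and the functions `shred` and `docs_with_sku` are not taken from any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumption: one row per element or attribute; "parent" encodes the edge.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, parent INTEGER, doc INTEGER, name TEXT, value TEXT)"
)

INSERT_NODE = "INSERT INTO node (parent, doc, name, value) VALUES (?, ?, ?, ?)"


def shred(doc_id, text):
    """Decompose an XML document into node rows and return the row count."""
    count = 0

    def walk(elem, parent_id):
        nonlocal count
        cur = conn.execute(
            INSERT_NODE, (parent_id, doc_id, elem.tag, (elem.text or "").strip())
        )
        elem_id = cur.lastrowid
        count += 1
        # Attributes become child rows whose names are prefixed with "@".
        for attr, val in elem.attrib.items():
            conn.execute(INSERT_NODE, (elem_id, doc_id, "@" + attr, val))
            count += 1
        for child in elem:
            walk(child, elem_id)

    walk(ET.fromstring(text), None)
    return count


def docs_with_sku(sku):
    """Evaluate the path /order/item[@sku=...] with one self-join per step."""
    rows = conn.execute(
        "SELECT DISTINCT o.doc FROM node AS o "
        "JOIN node AS i ON i.parent = o.id "
        "JOIN node AS a ON a.parent = i.id "
        "WHERE o.parent IS NULL AND o.name = 'order' "
        "AND i.name = 'item' AND a.name = '@sku' AND a.value = ? "
        "ORDER BY o.doc",
        (sku,),
    )
    return [doc for (doc,) in rows]


rows_written = shred(
    1, "<order><item sku='A1' qty='2'/><item sku='B7' qty='1'/></order>"
)
```

Here a document with three elements and four attributes already produces seven rows, and a path with only two steps and one predicate needs two self-joins, which suggests how row counts and join depth grow with document size and query complexity.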
Yet another approach is to store XML-specific binary data, such as a token stream or a hierarchical data model, in a BLOB (Binary Large Object). This format can speed up queries but not updates, because of the restrictions of the LOB model. Unfortunately, extending the LOB operation model to support flexible partial updates without affecting references in indexes is not easy, since XML indexes reference positions within the LOBs.
Accordingly, there exists a need for a scalable storage scheme for native XML column data of relational tables. The storage scheme should store XML data such that scalability is increased and queries are more efficient. The present invention addresses such a need.