The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Extensible Markup Language (XML) is a World Wide Web Consortium (W3C) standard for representing data. Many applications are designed to output data in the form of XML documents. XML data comprises structured data items that form a hierarchy. In XML, data items known as elements are delimited by an opening tag and a closing tag. An element may also comprise attributes, which are specified in the opening tag of the element. Text between the tags of an element may represent any sort of data value, such as a string, date, or integer. An element may have one or more children. The resulting hierarchical structure of XML-formatted data is discussed in terms akin to those used to discuss a family tree. For example, a sub-element is said to descend from its parent element or any element from which its parent descended. A parent element is said to be an ancestor element of any sub-element of itself or of one of its descendant elements. Collectively, an element, along with its attributes and descendants, is referred to as a tree or a sub-tree.
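By way of illustration only, the following sketch parses a small XML document with Python's standard `xml.etree.ElementTree` module and lists the descendants of an element. The sample document, its element names, and the `descendants` helper are hypothetical and are not drawn from any particular system.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only to illustrate the terminology.
DOC = """<order id="17">
  <customer>Ann</customer>
  <item sku="A1"><qty>2</qty></item>
</order>"""

root = ET.fromstring(DOC)


def descendants(element):
    """Return the tags of every element that descends from `element`."""
    # iter() walks the sub-tree in document order, starting with the
    # element itself, so the element is filtered out of its own result.
    return [node.tag for node in element.iter() if node is not element]


item = root.find("item")

print(root.tag, root.attrib)   # element name and the attributes of its opening tag
print(descendants(root))       # every descendant of <order>
print(descendants(item))       # the sub-tree rooted at <item>
print(item.find("qty").text)   # text between the tags is the data value
```

In this sketch, `<order>` is an ancestor of `<qty>` even though `<qty>` is a child of `<item>`, which matches the family-tree terminology described above.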
With the rise in popularity of XML, many relational database systems have added support for storing, managing, and querying XML content. The term relational database system refers to any database system that supports the relational model of data processing, including database systems that may also support other models of data processing, such as the object-relational model and various models associated with the XML standard (e.g., XQuery, XPath).
Relational database systems that store XML documents may store individual elements in separate rows of a table. Such documents are referred to herein as shredded documents. The process of dividing an XML document into discrete element values (or representations thereof), and storing those values in rows that hold the data representing the XML document and/or nodes thereof, is referred to as shredding the XML document.
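By way of illustration only, the following sketch shreds a small XML document into one row per element and stores the rows in an in-memory SQLite table. The table name `xml_node`, its columns, the sample document, and the `shred` helper are assumptions made for this example; actual database systems may use quite different row layouts.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = (
    '<order id="17"><customer>Ann</customer>'
    '<item sku="A1"><qty>2</qty></item></order>'
)


def shred(xml_text):
    """Return one (node_id, parent_id, tag, attrs, text) row per element."""
    rows = []

    def visit(element, parent_id):
        node_id = len(rows) + 1  # ids are assigned in document order
        attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
        text = (element.text or "").strip()
        rows.append((node_id, parent_id, element.tag, attrs, text))
        for child in element:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return rows


conn = sqlite3.connect(":memory:")
# Assumed schema: the parent_id column preserves the document hierarchy.
conn.execute(
    "CREATE TABLE xml_node ("
    "node_id INTEGER PRIMARY KEY, parent_id INTEGER, "
    "tag TEXT, attrs TEXT, text TEXT)"
)
conn.executemany("INSERT INTO xml_node VALUES (?, ?, ?, ?, ?)", shred(DOC))

for row in conn.execute("SELECT * FROM xml_node ORDER BY node_id"):
    print(row)
```

Each row in this sketch carries a node identifier and a parent identifier in addition to the element's own data, which suggests how a shredded representation can hold more bookkeeping than the original text.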
A shredded version of an XML document may have a much larger storage footprint than the same XML document stored in other forms, such as text-based file storage. There is thus a need to store shredded XML documents more efficiently.