The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Extensible Markup Language (“XML”) describes and provides structure to a body of data, such as a file or data packet. The XML standard provides for tags that delimit sections of XML data referred to as XML elements. XHTML, a reformulation of HTML, is one example of an XML-based language.
An XML element may contain various types of data, including attributes and other elements. XML documents are typically quite verbose in that they can contain a large number of repeated start tags, end tags, and whitespace characters. Although the XML text format was designed for readability, it was not designed for efficient data storage or data transmission.
Binary XML addresses this verbosity. It is one format in which XML data can be stored in a database or XML repository (XR), and it is a compact binary representation of XML that was designed to reduce the size of XML documents. When stored in binary format, an XML document typically consumes much less space than it would in other XML storage formats. However, this space savings is achieved at the cost of additional processing overhead required to convert textual XML to binary XML, and to convert binary XML back into textual XML.
Although reference is sometimes made to a single “binary XML”, XML data may be stored in multiple, proprietary binary formats. One of these formats represents strings (“tokens”) with fixed values. In this implementation of binary XML, a mapping is established between tokens and replacement values, where the tokens are tag names, and the replacement values are numbers. Such mappings for a set of XML data, such as an XML document, are referred to herein as a “token vocabulary.” Once a token vocabulary has been created, XML documents may be stored in binary XML based on the token vocabulary. In typical implementations of binary XML, even symbols such as “<”, “>”, and “/” can be represented by binary replacement values.
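For illustration only, the token vocabulary idea can be sketched in a few lines of Python. The opcodes (START, END, TEXT), the byte layout, and the function names below are assumptions made up for this sketch and do not correspond to any particular binary XML format. The sketch also ignores attributes, text that follows a child element, and character escaping.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical opcodes for this sketch; not taken from any real binary XML format.
START, END, TEXT = 0x01, 0x02, 0x03


def build_vocabulary(root):
    """Map each distinct tag name (token) to a number, in first-seen order."""
    vocab = {}
    for elem in root.iter():
        vocab.setdefault(elem.tag, len(vocab))
    return vocab


def encode(elem, vocab, out):
    """Append a binary encoding of elem and its descendants to the bytearray out."""
    # A start tag becomes one opcode byte plus a 2-byte replacement value.
    out.extend(struct.pack(">BH", START, vocab[elem.tag]))
    text = (elem.text or "").strip()
    if text:
        data = text.encode("utf-8")
        out.extend(struct.pack(">BH", TEXT, len(data)))
        out.extend(data)
    for child in elem:
        encode(child, vocab, out)
    # An end tag becomes a single byte; the tag name is implied by nesting.
    out.append(END)
    return out


def decode(data, vocab):
    """Convert the binary form back into textual XML using the same vocabulary."""
    names = {value: tag for tag, value in vocab.items()}
    parts, stack, pos = [], [], 0
    while pos < len(data):
        op = data[pos]
        pos += 1
        if op == START:
            (value,) = struct.unpack_from(">H", data, pos)
            pos += 2
            stack.append(names[value])
            parts.append("<" + names[value] + ">")
        elif op == TEXT:
            (length,) = struct.unpack_from(">H", data, pos)
            pos += 2
            parts.append(data[pos:pos + length].decode("utf-8"))
            pos += length
        elif op == END:
            parts.append("</" + stack.pop() + ">")
        else:
            raise ValueError("unknown opcode: %d" % op)
    return "".join(parts)


doc = ("<orders><order><item>pen</item><qty>2</qty></order>"
       "<order><item>ink</item><qty>1</qty></order></orders>")
root = ET.fromstring(doc)
vocab = build_vocabulary(root)
encoded = bytes(encode(root, vocab, bytearray()))
print(vocab)
print(len(doc), "characters of text;", len(encoded), "bytes encoded")
```

In this sketch the repeated tag names are replaced by short numeric values, so the encoded form is smaller than the text. The encoded bytes are meaningful only together with the token vocabulary, which must be available to whatever later decodes the data.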
The number of businesses exchanging information electronically is growing rapidly. Businesses that exchange information have recognized the need for a common standard for representing data, and XML is rapidly becoming that common standard. As stated, XML data is sometimes stored in an XML repository. However, if the burden of encoding the data into binary XML is imposed on the XML repository, valuable CPU resources of the repository must be dedicated to this task. Consequently, an improved mechanism for loading XML data into an XML repository is desired.