XML is an open-standard, general-purpose markup language used to encode documents in order to facilitate the sharing of data via the Internet. XML is similar to HTML, which is used to create documents on the World Wide Web that incorporate text, graphics, sound, video, and hyperlinks. However, XML differs from HTML in that it employs tags that indicate the logical structure of the coded data rather than its display specifications.
There are many different ways to represent XML data in memory structures. For example, XML data may be represented as (1) character sequences, as recommended by the World Wide Web Consortium (W3C); (2) character sequences with secondary indexes to improve access speed; (3) a list of structures, each structure representing some aspect of the XML data; or (4) a tree of structures, each structure representing some aspect of the XML data.
A typical ‘list of structures’ representation of XML data uses structure types to represent different features of the XML data and connects these structures into a list in document order. Typical structure types are elements, attributes, character data, namespaces, processing instructions and comments.
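By way of illustration only, a ‘list of structures’ representation may be sketched as follows. This is a minimal sketch and not a definitive implementation: the names Node, ListBuilder and to_node_list, and the choice to record a nesting depth in each structure, are assumptions made for this example. The sketch uses the SAX parser from the Python standard library, whose ContentHandler does not report comments or namespace declarations, so those structure types are omitted here.

```python
from dataclasses import dataclass
import xml.sax


@dataclass
class Node:
    """One structure in the list (hypothetical layout for illustration)."""
    kind: str        # "element", "attribute", "text" or "pi"
    name: str = ""
    value: str = ""
    depth: int = 0   # nesting depth; an assumption, not required by the text


class ListBuilder(xml.sax.ContentHandler):
    """Appends one Node per XML feature, in document order."""

    def __init__(self):
        super().__init__()
        self.nodes = []
        self.depth = 0

    def startElement(self, name, attrs):
        self.nodes.append(Node("element", name, depth=self.depth))
        # Attributes follow their owning element in the list.
        for attr_name, attr_value in attrs.items():
            self.nodes.append(Node("attribute", attr_name, attr_value, self.depth + 1))
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1

    def characters(self, content):
        # Skip whitespace-only runs; the parser may deliver text in several chunks.
        if content.strip():
            self.nodes.append(Node("text", value=content, depth=self.depth))

    def processingInstruction(self, target, data):
        self.nodes.append(Node("pi", target, data, self.depth))


def to_node_list(xml_text):
    """Parse xml_text and return its structures as a flat list."""
    handler = ListBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.nodes


if __name__ == "__main__":
    for node in to_node_list('<a x="1"><b>hi</b><?p d?></a>'):
        print(node)
```

In this sketch the only link between structures is their position in the list, which is what keeps the representation compact relative to a tree in which each structure also carries pointers to its parent and children.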
The ‘list of structures’ representation of XML data has the advantage of being relatively compact compared to a ‘tree of structures’ representation and is also easier to process automatically than character sequences. Thus, it is a good compromise between slow character sequence processing and a ‘tree of structures’ representation, which may require a large amount of memory.
A search of an XML document may be logically specified using the hierarchy exposed by the XML data. For example, the search “find the children of a node ‘b’ which is the child of a node ‘a’” is expressed in the World Wide Web Consortium (W3C) XML Path Language (XPath) syntax as “//a/b/*”, in a way similar to the use of regular expressions with character sequences. However, the use of a hierarchical search model over data that is stored as a list of structures may be slow in comparison to searching data that is represented as a ‘tree of structures’, where pointers between the structures may be used to limit the search space. Thus, a ‘tree of structures’ representation provides for random access to the XML data, whereas a ‘list of structures’ representation only provides for sequential access.
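The contrast between the two access models may be sketched as follows, under stated assumptions. The function children_of_a_b and the (depth, name) tuple layout of the list are hypothetical and chosen only for this example; the tree search uses the limited XPath support of the ElementTree module in the Python standard library. The list search must visit every structure in document order and reconstruct the ancestor path as it goes, whereas the tree search follows child pointers.

```python
import xml.etree.ElementTree as ET

DOC = "<root><a><b><c/><d/></b></a><b><e/></b></root>"

# Hypothetical 'list of structures' for DOC: element records in document
# order, each carrying only its nesting depth and name.
ELEMENTS = [
    (0, "root"),
    (1, "a"),
    (2, "b"),
    (3, "c"),
    (3, "d"),
    (1, "b"),
    (2, "e"),
]


def children_of_a_b(elements):
    """Sequential evaluation of //a/b/* over a flat list of (depth, name)."""
    path = []   # names of the ancestors of the record being visited
    hits = []
    for depth, name in elements:
        del path[depth:]               # drop ancestors that have been closed
        if path[-2:] == ["a", "b"]:    # parent is 'b', grandparent is 'a'
            hits.append(name)
        path.append(name)
    return hits


# Tree of structures: the same query answered by following child pointers.
tree_hits = [el.tag for el in ET.fromstring(DOC).findall(".//a/b/*")]
list_hits = children_of_a_b(ELEMENTS)

if __name__ == "__main__":
    print(list_hits)
    print(tree_hits)
```

Both searches return the same elements for this document, but the list search inspects all seven records, including the second ‘b’ element that is not a child of ‘a’, because nothing in the list allows that subtree to be skipped.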
Although the following Detailed Description will proceed with reference being made to illustrative embodiments of the claimed subject matter, many alternatives, modifications, and variations thereof will be apparent to those skilled in the art. Accordingly, it is intended that the claimed subject matter be viewed broadly, and be defined only as set forth in the accompanying claims.