The exemplary embodiment relates to the processing of n-ary trees. It finds particular application in connection with an apparatus and method for representing the structure of an XML document, allowing large tree structures to be stored using less memory than other approaches.
The Extensible Markup Language (XML) is a widely used markup language which aids information systems in sharing structured data, encoding documents, and serializing data. XML provides a basic syntax for sharing information between different computers, different applications, and different organizations without needing to pass through many layers of conversion. XML documents are stored in the form of a tree in which each node is connected directly or indirectly to a root node and each node has at most one parent node. Data, such as lines of text, is associated with at least some of these nodes. For large documents, for example books such as manuals, the tree structure can be very large.
It is often desirable to store large XML trees in memory for manipulation (e.g., swapping the position of two sibling nodes, adding nodes, or deleting nodes). If the representation of the tree structure is larger than the available physical memory, then only a portion of the tree can be loaded in memory at one time. Should a user wish to work on a portion of the tree that is not currently in memory, the user must wait while that portion is loaded. Therefore, it would be advantageous to have as efficient a representation of the XML tree in memory as possible while still being able to manipulate the tree. Any gain in the efficiency of the representation would allow a larger XML tree to be manipulated within a given amount of memory. In addition, a more efficient representation of the XML tree could take up less space in non-volatile storage, for example on a hard disk.
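By way of illustration only, the manipulations mentioned above (swapping two sibling nodes, adding a node, and deleting a node) can be sketched on a conventional in-memory tree using the `xml.etree.ElementTree` module of the Python standard library. This is not the representation of the exemplary embodiment; the sample document and the helper names `swap_siblings`, `add_child`, `delete_child`, and `child_texts` are assumptions introduced for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = (
    "<manual>"
    "<chapter>Intro</chapter>"
    "<chapter>Setup</chapter>"
    "<chapter>Usage</chapter>"
    "</manual>"
)


def swap_siblings(parent, i, j):
    """Swap the children of `parent` at positions i and j."""
    parent[i], parent[j] = parent[j], parent[i]


def add_child(parent, tag, text):
    """Append a new child node carrying `text` and return it."""
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


def delete_child(parent, index):
    """Remove the child of `parent` at position `index`."""
    parent.remove(parent[index])


def child_texts(parent):
    """Return the text of each direct child, in document order."""
    return [child.text for child in parent]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)       # parse into an in-memory tree
    swap_siblings(root, 0, 1)          # swap the first two chapters
    add_child(root, "chapter", "Appendix")
    delete_child(root, 2)              # delete the third chapter ("Usage")
    print(child_texts(root))
```

In a conventional representation of this kind, every node is a separate object holding references to its children, so the memory used grows with the number of nodes in addition to the data they carry. That overhead is the cost a more compact representation would aim to reduce.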