Data processing tasks commonly occur in highly distributed environments and across heterogeneous computing platforms and data processing applications. The growth in popularity of the Internet and the proliferation of web sites performing a wide variety of data processing tasks have only exacerbated this situation. Correspondingly, a great need quickly arose not only to format data but also to communicate that format to computing systems in a platform- or application-neutral manner. To address this need, documents created using the extensible markup language (XML) family of specifications became an unofficial standard for data communications across networked machines.
A great strength of XML is its use of tags to describe and structure data included in a document. The use of these descriptive tags makes XML a self-documenting data format. However, a common drawback associated with the use of tags is the challenge of converting data from an XML document into a data structure that efficiently stores the data in memory. Part of this challenge stems from the fact that a typical XML document is both verbose and redundant. Much of the redundancy arises because well-formed XML includes both opening and closing tags, so an element name typically appears twice for each element. Additionally, names of tagged elements are often repeated as multiple instances of the same enclosing element occur throughout an XML document.
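The redundancy described above can be made concrete with a small sketch. The sample document and the helper `name_counts` are hypothetical and used only for illustration; the parsing relies on Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document, used only for illustration.
SAMPLE = (
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title><author>Y</author></book>"
    "</library>"
)

def name_counts(xml_text):
    """Count how many elements carry each tag name."""
    root = ET.fromstring(xml_text)
    # iter() visits the root element and all of its descendants.
    return Counter(element.tag for element in root.iter())

counts = name_counts(SAMPLE)
total_elements = sum(counts.values())  # number of elements in the document
unique_names = len(counts)             # number of distinct element names

print(counts)
print(total_elements, unique_names)
```

In this sample there are seven elements but only four distinct names, and each name is additionally written out in both an opening and a closing tag in the document text.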
One common approach to dealing with this drawback is to create a data structure in memory to store unique names of tagged XML elements. This data structure is typically called an XML name table, or simply a name table. Because each unique name is stored only once and later occurrences can refer to the stored copy, use of an XML name table can greatly speed processing and reduce computational overhead. Use of a global XML name table that stores unique names of tagged XML elements across more than one XML document can provide similar benefits.
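A minimal sketch of such a name table follows. The class `NameTable`, its methods `intern` and `name`, the helper `intern_document`, and the two sample documents are all hypothetical names chosen for illustration; they are not drawn from any particular XML library. The sketch assumes that a name table maps each unique element name to a small integer identifier.

```python
import xml.etree.ElementTree as ET

class NameTable:
    """Illustrative name table: stores each unique name once and
    returns a small integer identifier for it."""

    def __init__(self):
        self._ids = {}    # name -> identifier
        self._names = []  # identifier -> name

    def intern(self, name):
        """Return the identifier for name, adding it if it is new."""
        identifier = self._ids.get(name)
        if identifier is None:
            identifier = len(self._names)
            self._ids[name] = identifier
            self._names.append(name)
        return identifier

    def name(self, identifier):
        """Return the name stored under identifier."""
        return self._names[identifier]

    def __len__(self):
        return len(self._names)

def intern_document(table, xml_text):
    """Replace every element name in a document with its identifier."""
    root = ET.fromstring(xml_text)
    return [table.intern(element.tag) for element in root.iter()]

# One table shared by two hypothetical documents (a "global" table).
table = NameTable()
doc_a = "<order><item>pen</item><item>ink</item></order>"
doc_b = "<invoice><item>pen</item><total>3</total></invoice>"

ids_a = intern_document(table, doc_a)
ids_b = intern_document(table, doc_b)

print(ids_a, ids_b, len(table))
```

Here the name `item` is stored once even though it occurs in both documents. The shared table only ever gains entries as further documents are processed, since this sketch has no mechanism for removing names.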
A drawback of current name tables is their size. In computational environments where more than one XML document is used, a traditional global XML name table can grow prohibitively large because it must track every unique name included in every XML document used by the computing system. Current systems lack efficient means to manage growth of a global XML name table. Additionally, contemporary systems lack effective ways to purge unused or infrequently used names from the global XML name table.