Modern computer systems process information in a variety of data formats. Some of these are markup language formats such as the Hypertext Markup Language (HTML) and the Extensible Markup Language (XML). Such markup language data formats are text-based. HTML is a markup language used for the representation of Web pages. XML is a widely adopted data encoding format and specification developed by the World Wide Web Consortium (W3C). XML is a pared-down version of Standard Generalized Markup Language (SGML), designed especially for the creation and representation of Web documents. XML files, often referred to as documents, provide a text-based encoding format that enables a human to view the file and obtain an understanding of its contents. XML is similar to HTML in that both use markup codes known as tags to identify specific data and attributes of that data. An XML document consists mainly of text and tags, and the tags impose a hierarchical tree structure upon the data contained in the XML document.
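As an illustration of the tree structure that tags impose, the following minimal sketch parses a small XML document with Python's standard xml.etree.ElementTree module and lists each tag with its nesting depth. The sample document and the helper name outline are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = (
    "<order id='17'>"
    "<customer>Ada</customer>"
    "<items><item sku='A1'/><item sku='B2'/></items>"
    "</order>"
)

def outline(element, depth=0):
    """Return (depth, tag) pairs for element and its descendants, in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

tree_rows = outline(ET.fromstring(DOC))
for depth, tag in tree_rows:
    # Indentation mirrors how deeply each tag is nested.
    print("  " * depth + tag)
```

The indentation of the printed tags reflects the parent and child relationships that the nesting of the tags defines.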
When computer systems process markup data formats such as XML, they often convert text strings appearing within the data into numeric identifiers or codes that allow more efficient processing. As an example, conventional XML processing software converts unique portions of XML text data, such as XML tags or uniform resource identifiers (e.g., URLs), into uniquely encoded numbers sometimes referred to as QNAMEs. A QNAME is thus a unique numerical representation of a character string, used to improve XML processing performance. To generate a QNAME, conventional software programs apply a hashing function or other processing to the unique text data to generate a unique numeric equivalent.
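One way such a text-to-number conversion might look is sketched below. It is not taken from any particular XML processor: the class name NameTable and its methods are hypothetical, and the sketch uses a dictionary (itself a hash table) that hands out sequential integers, so that each distinct string receives exactly one identifier and collisions between different strings cannot occur.

```python
class NameTable:
    """Hypothetical table mapping each distinct string to one integer identifier."""

    def __init__(self):
        self._ids = {}     # string -> numeric identifier
        self._names = []   # numeric identifier -> string

    def intern(self, name: str) -> int:
        """Return the identifier for name, assigning a new one on first sight."""
        existing = self._ids.get(name)
        if existing is not None:
            return existing
        new_id = len(self._names)
        self._ids[name] = new_id
        self._names.append(name)
        return new_id

    def name_of(self, identifier: int) -> str:
        """Recover the original text for a previously assigned identifier."""
        return self._names[identifier]


table = NameTable()
order_id = table.intern("order")
item_id = table.intern("item")
```

Calling intern again with the same string returns the identifier assigned earlier, so a tag that appears many times in a document is always represented by the same number.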
Once the conventional XML processing software has converted unique text strings in the markup language format into their respective QNAMEs, the software can perform processing on the QNAMEs rather than on the actual tag (i.e., text) corresponding to each QNAME. As noted above, one purpose of converting text strings to QNAMEs is that computer systems process numeric values more efficiently than text data. As an example, the markup language processing software can perform operations such as comparisons on the unique numeric identifiers (i.e., QNAMEs) more efficiently than if it applied equivalent processing to the text data associated with those identifiers. As a result, processing of the markup language data is faster.
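The following sketch illustrates the idea of comparing numeric identifiers instead of text. The tag sequence and the function names encode and count_matches are assumptions for this example. Each tag is converted to an integer once, after which matching is a single integer comparison per tag rather than a character-by-character string comparison. The sketch makes no measured performance claim.

```python
def encode(tags):
    """Map each distinct tag to an integer and return (mapping, encoded sequence)."""
    ids = {}
    # setdefault assigns the next unused integer the first time a tag is seen.
    encoded = [ids.setdefault(tag, len(ids)) for tag in tags]
    return ids, encoded

def count_matches(encoded, target_id):
    """Count occurrences of target_id using integer comparisons only."""
    return sum(1 for identifier in encoded if identifier == target_id)

# Hypothetical stream of tag names as they might appear in a document.
tags = ["order", "item", "item", "customer", "item"]
ids, encoded = encode(tags)
item_count = count_matches(encoded, ids["item"])
```

The text of each tag is examined only during the encoding step. Every later operation works on the integers alone.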