Extensible Markup Language (XML) is a standardized text format that can be used for transmitting structured data to web applications. XML offers significant advantages over Hypertext Markup Language (HTML) in the transmission of structured data.
XML differs from HTML in at least three ways. First, in contrast to HTML, users of XML may define additional tag and attribute names at will. Second, users of XML may nest document structures to any level of complexity. Third, optional grammar descriptions may be added to an XML document to allow for structural validation of the document. In general, XML is more powerful, easier to implement, and easier to understand.
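The first two differences can be illustrated with a minimal sketch using Python's standard xml.etree.ElementTree module. The purchase-order vocabulary below (purchaseOrder, buyer, items, item, sku and so on) is invented for illustration and is not taken from any standard. ElementTree only checks well-formedness, so the third difference, validation against an optional grammar, is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase-order document. Every tag and attribute name is
# made up for this example, which XML permits; HTML has a fixed tag set.
PO_XML = """
<purchaseOrder number="1001">
  <buyer>
    <name>Example Corp</name>
  </buyer>
  <items>
    <item sku="A-1">
      <description>Widget</description>
      <quantity>3</quantity>
    </item>
    <item sku="B-2">
      <description>Gasket</description>
      <quantity>10</quantity>
    </item>
  </items>
</purchaseOrder>
""".strip()


def max_depth(element, depth=1):
    """Return the deepest nesting level found at or below `element`."""
    return max([depth] + [max_depth(child, depth + 1) for child in element])


root = ET.fromstring(PO_XML)

# Collect the distinct user-defined tag names that appear in the document.
tag_names = sorted({element.tag for element in root.iter()})

print(tag_names)
print(max_depth(root))
```

The parser accepts the invented tag names without any prior declaration, and the nesting depth is limited only by how the author chooses to structure the document.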
XML is not, however, backward-compatible with existing HTML documents, although documents conforming to the W3C HTML 3.2 specification can be easily converted to XML, as can documents conforming to ISO 8879 (SGML). Further, while XML allows for increased flexibility, XML documents do not provide a convenient mechanism for searching or retrieving portions of a document. Where large numbers of XML documents are involved, considerable time may be consumed searching for small portions of documents.
For example, in a business environment, XML may be used to efficiently encode information from purchase orders (POs). However, where a search must later be performed based upon certain information elements within a PO, the entire document must be searched before those information elements can be located. Because of the importance of information processing, a need exists for a better method of searching XML documents.
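The cost of such a search can be sketched as follows, assuming a naive approach in which each PO is stored as a separate XML string and no index exists. The document contents, the tag and attribute names, and the find_orders_with_sku helper are all hypothetical. The sketch counts how many elements must be visited to answer a single query.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored purchase orders, one XML string per document.
DOCUMENTS = [
    '<purchaseOrder number="1001"><buyer>Example Corp</buyer>'
    '<item sku="A-1"/><item sku="B-2"/></purchaseOrder>',
    '<purchaseOrder number="1002"><buyer>Sample Ltd</buyer>'
    '<item sku="C-3"/></purchaseOrder>',
    '<purchaseOrder number="1003"><buyer>Demo Inc</buyer>'
    '<item sku="B-2"/><item sku="D-4"/></purchaseOrder>',
]


def find_orders_with_sku(documents, sku):
    """Return (matching PO numbers, count of elements visited).

    Without an index, every document is parsed in full and every
    element is inspected, even when only one small part is of interest.
    """
    matches = []
    visited = 0
    for text in documents:
        root = ET.fromstring(text)      # parse the whole document
        found = False
        for element in root.iter():     # walk every element in it
            visited += 1
            if element.tag == "item" and element.get("sku") == sku:
                found = True
        if found:
            matches.append(root.get("number"))
    return matches, visited


matches, visited = find_orders_with_sku(DOCUMENTS, "B-2")
print(matches, visited)
```

In this sketch all eleven elements across the three documents are visited to find two matching orders, so the work grows with the total size of the collection rather than with the size of the answer.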