1. Field of the Invention
This invention relates to XML processing, and in particular, it relates to a method and related apparatus for parsing XML files.
2. Description of Related Art
XML (Extensible Markup Language) is a general-purpose markup language widely used to facilitate the sharing of data across different information systems, particularly systems connected via a network such as the Internet. A number of well-known XML processing software libraries are available to software developers. The two most widely used approaches to parsing XML files are DOM (Document Object Model) and SAX (Simple API for XML). In a DOM-style parse, the parser module converts an XML document into a tree data structure held in memory. Each node of the tree corresponds to a structural element of the XML file. For extremely large XML files, a DOM-style parse is problematic because of the large amount of memory required to store the document tree data structure. A DOM-style parse of such a large file could cause the application to attempt to allocate more physical memory than is available, resulting in an out-of-memory condition. For such large files, a SAX-style parse is preferred. A SAX parse is event-driven and takes a piecemeal approach to processing an XML document. In a SAX parse, an application (such as an XML to PostScript® (PS) converter program or another program that uses XML files) implements a set of pre-defined callback functions that are invoked by a SAX parser, which is a separate module. When an instance of the SAX parser is created, a pointer to the callback functions is passed to the parser. The SAX parser then reads through the XML document from start to finish and invokes the corresponding callback function for each XML structural element that it encounters.
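The event-driven SAX arrangement described above can be illustrated with a minimal sketch using the xml.sax module of the Python standard library. This is offered only as an illustration of the general SAX style, not as the parser of any particular application. The class name EventRecorder, the function parse_events, and the sample document are hypothetical; in Python the application's callbacks are passed to the parser as a handler object rather than as a pointer to functions.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical application-side callbacks invoked by the SAX parser."""

    def __init__(self):
        super().__init__()
        self.events = []  # (event kind, element name) in document order

    def startElement(self, name, attrs):
        # Called by the parser when it encounters an opening tag.
        self.events.append(("start", name))

    def endElement(self, name):
        # Called by the parser when it encounters a closing tag.
        self.events.append(("end", name))


def parse_events(xml_bytes):
    """Run a SAX parse over xml_bytes and return the recorded events."""
    handler = EventRecorder()
    # The handler object carrying the callbacks is handed to the parser,
    # which reads the document from start to finish and never builds a tree.
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


# Hypothetical sample document, used only for illustration.
SAMPLE = b"<job><page>one</page><page>two</page></job>"

for kind, name in parse_events(SAMPLE):
    print(kind, name)
```

Because the handler keeps only what it chooses to record, memory use in such a sketch depends on the application's callbacks rather than on the size of the document, in contrast to a DOM-style parse that holds the whole tree in memory.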