1. Technical Field
The present invention relates generally to XML processing, and more particularly to systems, methods and computer program products for improving parallel processing of XML documents.
2. Discussion of Related Art
The eXtensible Markup Language (XML) is widely used in web services, messaging systems, databases, and document processing. The processing of XML documents is often a performance bottleneck in computer systems and applications, particularly if the XML documents are large (e.g., file sizes greater than one gigabyte). Many systems designed primarily for handling relational data have difficulty processing such large XML documents, leading to scalability problems that can be alleviated to some degree if parallel processing is enabled. Moreover, with the increasing popularity of multi-processor systems (e.g., systems using multi-core processors), there are more opportunities to process XML documents in parallel. Parallel processing of XML documents can be difficult, however. For example, an XML document typically must be partitioned in order to achieve parallel processing, and this partitioning generally requires pre-processing (pre-parsing) of the document in order to determine the schema and thus appropriate partition points within the document. Because pre-parsing cannot be performed in parallel, the pre-parsing step must scan the entire document serially before any parallel work can begin. The pre-parsing step itself therefore creates significant performance overhead and limits the advantages of parallel processing.
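The conventional approach described above can be illustrated with a minimal sketch. It is not taken from any particular prior-art system. The function names find_partition_points and count_elements_parallel are hypothetical, and the sketch assumes a document whose root element has only element children separated by whitespace, with no namespace prefixes. The first function is the serial pre-parse: it must read the whole document before a single partition can be handed to a worker.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat
from concurrent.futures import ThreadPoolExecutor


def find_partition_points(data):
    """Serial pre-parse (hypothetical helper): scan the whole document once
    and return (start, end) byte offsets, one pair per child of the root."""
    parser = xml.parsers.expat.ParserCreate()
    depth = 0
    starts = []
    root_end = None

    def on_start(name, attrs):
        nonlocal depth
        depth += 1
        if depth == 2:
            # Byte offset of the '<' that opens this child of the root.
            starts.append(parser.CurrentByteIndex)

    def on_end(name):
        nonlocal depth, root_end
        if depth == 1:
            # Byte offset where the root's end tag begins.
            root_end = parser.CurrentByteIndex
        depth -= 1

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)

    # Each partition runs from one child's start to the next child's start;
    # the last partition ends where the root's end tag begins.
    bounds = starts + [root_end]
    return list(zip(bounds, bounds[1:]))


def count_elements(chunk):
    """Example per-partition work: count the elements in one subtree."""
    return sum(1 for _ in ET.fromstring(chunk).iter())


def count_elements_parallel(data, workers=4):
    """Pre-parse serially, then process the partitions concurrently."""
    spans = find_partition_points(data)  # nothing below can start before this
    chunks = [data[start:end] for start, end in spans]
    # A thread pool keeps the sketch self-contained; in CPython a process
    # pool would be needed for real CPU parallelism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_elements, chunks))
```

In this sketch the call to find_partition_points grows with the size of the whole document and runs on one thread, so adding workers shortens only the second phase. This is the overhead the paragraph above attributes to pre-parsing.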