The present invention relates to optimizing processing of electronic documents, such as Extensible Markup Language (XML) documents or similar electronic documents, and more particularly to a method and system for effective schema generation via programmatic analysis for optimizing the processing of electronic documents.
Optimizing the processing of electronic documents, such as XML documents, can dramatically improve runtime efficiency, reduce memory requirements, and provide other benefits. Known XML optimization techniques, such as efficient parser generation, XML shredding, and input specialization, require a description of the expected XML documents in the form of an XML Schema or the equivalent. From this input description or schema, specialized code or data representations may be generated that are specifically optimized for the particular class of XML input documents. In practice, however, XML Schemas or input descriptions are often unavailable for performing such optimizations, are inapplicable, or do not exist at all. Some input documents may be merely well-formed and not required to be valid instances of any specific schema. Processing a document may also be desired even though a faulty instance of the document's nominal schema is all that is available.