1. Field of the Invention
This invention relates to XML (Extensible Markup Language) Schema, and particularly to a method for constructing a linear-sized validation plan for W3C (World Wide Web Consortium) XML Schema grammars.
2. Description of Background
Existing techniques for optimized parsing and validation apply widely known optimization methods by converting a given grammar into a well-understood finite-state machine. The content-model definition language of XML Schema, however, is not easily converted into such structures. Content models defined in XML Schema can compactly represent a wide array of content-model constraints. In particular, three styles of composition (i.e., sequence, choice, and all) are supported, as well as arbitrary occurrence bounds. The expressivity of this language allows highly complex content models to be written in a relatively compact form, and this complexity is at odds with traditional finite-state models. Particles composed with the “all” compositor, for example, result in a combinatorial explosion of states in a finite-state graph, and even simple occurrence ranges are represented with a state for each iteration of a repetition.
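The state growth described above can be illustrated with a minimal sketch. It is not taken from any particular validator: the function names are hypothetical, the "all" group is assumed to contain distinct elements that must each appear exactly once, and the subset-tracking construction is simply one straightforward way to build a deterministic automaton for such a group.

```python
def all_group_dfa_states(names):
    """Enumerate the states of a naive DFA for an "all" group.

    Assumption: each state records the set of member elements already
    seen, and every member must appear exactly once, in any order.
    """
    start = frozenset()
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for name in names:
            if name not in state:
                nxt = state | {name}  # transition on an element not yet seen
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


def unrolled_occurrence_states(max_occurs):
    """Enumerate the states of a chain automaton that unrolls a repetition.

    Assumption: one state per iteration count, from 0 up to max_occurs.
    """
    return list(range(max_occurs + 1))


if __name__ == "__main__":
    # Three members of an "all" group already need 2**3 = 8 states.
    print(len(all_group_dfa_states(["a", "b", "c"])))
    # Ten members need 2**10 = 1024 states.
    print(len(all_group_dfa_states([f"e{i}" for i in range(10)])))
    # An occurrence range with an upper bound of 1000 needs 1001 states.
    print(len(unrolled_occurrence_states(1000)))
```

Under these assumptions the schema text grows by one declaration per added "all" member and by a few digits per larger occurrence bound, while the naive automaton grows exponentially in the first case and linearly in the numeric value of the bound in the second.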
While many of the excess states in the finite-state model may eventually collapse into a relatively simple execution plan, their construction and optimization waste computing and memory resources at compile time, and may prevent compilation from completing at all. Furthermore, if the optimizations are poorly implemented, artifacts of the blowup may appear in the final execution plan and degrade runtime performance. Considering the limitations of these finite-state models, it is therefore desirable to formulate a method for constructing a linear-sized validation plan for W3C XML Schema grammars.