The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Database systems often store XML-formatted data within their databases. This data may come from a variety of sources, though the source is often an XML document or a database object.
In XML, data items known as elements are delimited by an opening tag and a closing tag. An element may also comprise attributes, which are specified in the opening tag of the element. Text between the tags of an element may represent any sort of data value, such as a string, date, or integer.
Text within an element may alternatively represent one or more elements. Elements represented within the text of another element are known as subelements or child elements. Elements that store subelements are known as parent elements. Since subelements are themselves elements, subelements may, in turn, be parent elements of their own subelements. The resulting hierarchical structure of XML-formatted data is often discussed in terms akin to those used to discuss a family tree. For example, a subelement is said to descend from its parent element or any element from which its parent descended. A parent element is said to be an ancestor element of any subelement of itself or of one of its descendant elements. Collectively, an element, along with its attributes and descendants, is often referred to as a tree or a subtree.
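By way of illustration only, the parent, child, and descendant relationships described above can be sketched in Python using the standard xml.etree.ElementTree module. The sample document and the helper name descendants are assumptions made for this example and do not come from any particular database system.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """<order id="17">
  <customer>Ann</customer>
  <items>
    <item sku="A1">2</item>
    <item sku="B2">5</item>
  </items>
</order>"""

root = ET.fromstring(DOC)


def descendants(element):
    """Return the tags of every element that descends from `element`."""
    # Element.iter() yields the element itself first, then its subtree
    # in document order, so the element itself is filtered out.
    return [node.tag for node in element.iter() if node is not element]


# Attributes are specified in the opening tag of the element.
root_attributes = dict(root.attrib)

# Child elements (subelements) of the root, in document order.
child_tags = [child.tag for child in root]

# Children, grandchildren, and so on.
descendant_tags = descendants(root)

# Text between the tags may represent a typed value, here an integer.
item_values = [int(item.text) for item in root.iter("item")]
```

In this sketch the order element is an ancestor of every other element, and the items element together with its two item subelements forms a subtree.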
XML Schema is a definition language that provides facilities for describing the structure and constraining the contents of an XML document. A specification, referred to hereinafter as “XML Schema Specification”, for the XML Schema definition language is described in a set of three documents published by the W3C. The first document in the set is “XML Schema Part 0: Primer Second Edition”, W3C Recommendation 28 Oct. 2004, located at “http://www.w3.org/TR/xmlschema-0/”, the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein. The second document in the set is “XML Schema Part 1: Structures Second Edition”, W3C Recommendation 28 Oct. 2004, located at “http://www.w3.org/TR/xmlschema-1/”, the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein. The third document in the set is “XML Schema Part 2: Datatypes Second Edition”, W3C Recommendation 28 Oct. 2004, located at “http://www.w3.org/TR/xmlschema-2/”, the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein. As referred to herein, an XML schema is a defined structure for XML documents. An XML schema representation is data that describes the XML structure. An XML schema representation may include an XML document with declarations and/or a tokenized XML representation, that is, a representation for which tokens have been generated. An example of an XML schema representation includes, but is not limited to, an XML document with type definitions, element declarations, or attribute declarations.
It is important for object-relational database systems that store XML data to be able to execute queries using XML query languages. XML Query Language (XQuery) and XML Path Language (XPath) are important query language standards, which can be used in conjunction with SQL to express a large variety of useful queries. XPath is described in XML Path Language (XPath), version 1.0 (W3C Recommendation 16 Nov. 1999), herein incorporated by reference and available at the time of writing at http://www.w3.org/TR/xpath, as well as in XML Path Language (XPath) 2.0 (W3C Recommendation 23 Jan. 2007), herein incorporated by reference and available at the time of writing at http://www.w3.org/TR/xpath. XQuery is described in XQuery 1.0: An XML Query Language (W3C Recommendation 23 Jan. 2007), herein incorporated by reference and available at the time of writing at http://www.w3.org/TR/xquery.
Some techniques for evaluating XML queries rely on normalizing an XML query to form a set of simple XPath expressions. The XPath expressions are then evaluated against a streamed XML data source using techniques that may be collectively referred to as streaming evaluations. Streaming evaluation techniques involve an XML event-streaming component and an XPath evaluation component. The event-streaming component parses an XML input stream and generates XML events for each element or attribute it finds in the XML data stream. It streams these events to the evaluation component, which evaluates the events to determine if they match a next unmatched step (i.e. constraint) in the XPath expression. One such streaming evaluation technique is discussed in “Technique To Estimate The Cost Of Streaming Evaluation Of XPaths,” incorporated above.
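As a minimal sketch of the two components described above, and not the technique of the incorporated reference, the following Python example pairs an event-streaming generator with an evaluation function that tracks the next unmatched step of a simple XPath expression. The function names stream_events and evaluate and the sample input are assumptions made for illustration. The sketch handles only child-axis steps on element names and does not generate attribute events.

```python
import io
import xml.etree.ElementTree as ET


def stream_events(xml_bytes):
    """Event-streaming component: yield ('start' | 'end', tag) per element."""
    parser_events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, element in parser_events:
        yield event, element.tag


def evaluate(path, events):
    """Evaluation component: count elements matching a child-axis-only path."""
    steps = path.strip("/").split("/")
    matched = 0  # number of leading steps matched by currently open elements
    depth = 0    # number of currently open elements
    hits = 0
    for event, tag in events:
        if event == "start":
            # Only an element directly below the last matched element can
            # satisfy the next unmatched step.
            if depth == matched and matched < len(steps) and tag == steps[matched]:
                matched += 1
                if matched == len(steps):
                    hits += 1
            depth += 1
        else:
            depth -= 1
            # Closing a matched element un-matches its step.
            if matched > depth:
                matched = depth
    return hits


# Hypothetical input stream, invented for this illustration.
SAMPLE = b"<order><items><item/><item/></items><note><item/></note></order>"
match_count = evaluate("/order/items/item", stream_events(SAMPLE))
```

Because the evaluation function consumes events one at a time, the sketch never needs to hold the whole document tree in order to decide whether a step matches.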
Another streaming evaluation technique involves compiling one or more XPath expressions into a state machine, such as a non-deterministic finite automaton (NFA). The state machine functions as an evaluation component. The states and state transitions of the state machine reflect each constraint in the set of XPath expressions. Based on events received from the event-streaming component, the state machine transitions between its various states. When the state machine is in an accepting state, it generates an XPath result for the set of XPath expressions.
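One possible way to picture such a state machine is sketched below in Python. The sketch is an assumption-laden simplification: it compiles a single XPath expression rather than a set, supports only the child ("/") and descendant ("//") axes on element names, and uses the hypothetical function names compile_path and run. State i means that the first i steps have been matched, and a descendant step adds a self-loop so that the machine may be in several states at once, which is what makes it non-deterministic.

```python
def compile_path(path):
    """Compile e.g. '/a//b' into a list of (axis, name) steps.

    State i of the automaton means "the first i steps are matched";
    state len(steps) is the accepting state.
    """
    steps = []
    axis = "child"
    for part in path.split("/")[1:]:
        if part == "":
            # An empty segment comes from '//' and marks the next step
            # as a descendant step.
            axis = "descendant"
            continue
        steps.append((axis, part))
        axis = "child"
    return steps


def run(steps, events):
    """Drive the automaton with ('start' | 'end', name) events; count results."""
    accepting = len(steps)
    stack = [{0}]  # one set of active states per open element
    results = 0
    for event, name in events:
        if event == "start":
            next_states = set()
            for state in stack[-1]:
                if state == accepting:
                    continue
                axis, wanted = steps[state]
                if wanted == name or wanted == "*":
                    next_states.add(state + 1)
                if axis == "descendant":
                    # Self-loop: keep waiting for a match at deeper levels.
                    next_states.add(state)
            if accepting in next_states:
                results += 1
            stack.append(next_states)
        else:
            # Leaving an element restores the states of its parent.
            stack.pop()
    return results


# Hypothetical event stream for <a><b><b/></b><c><b/></c></a>.
EVENTS = [
    ("start", "a"), ("start", "b"), ("start", "b"), ("end", "b"), ("end", "b"),
    ("start", "c"), ("start", "b"), ("end", "b"), ("end", "c"), ("end", "a"),
]
descendant_matches = run(compile_path("/a//b"), EVENTS)
child_matches = run(compile_path("/a/b"), EVENTS)
```

Keeping one set of active states per open element lets the machine return to the parent's states on each end event without re-reading earlier input.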
In some cases, an XML event-streaming component must also function as an XML decoder. This is because many database systems binary-encode XML data, as taught in, for example, “TECHNIQUES FOR EFFICIENT LOADING OF BINARY XML DATA,” incorporated above. An XML event-streaming component must decode the binary-encoded XML input stream into a textual representation before it can interpret the XML data inside that stream. Only then can it recognize elements and attributes. Typically, a binary-encoding for an XML data source is based on an XML Schema. Thus, the XML decoder will utilize an XML Schema to decode the binary-encoded XML data.
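The role of a schema-derived mapping during decoding can be sketched as follows. The byte layout used here (the opcodes START, END, and TEXT, one-byte token identifiers, and one-byte text lengths) is entirely hypothetical and is not the encoding of any actual database system or of the incorporated reference; the token table stands in for information that a decoder might obtain from an XML Schema. For brevity, the sketch emits events directly rather than first producing a textual representation.

```python
# Hypothetical opcodes for an invented, illustrative binary encoding.
START, END, TEXT = 0x01, 0x02, 0x03


def decode(data, token_names):
    """Decode the hypothetical binary stream into (kind, value) events.

    `token_names` maps token identifiers to element names and stands in
    for a mapping derived from an XML Schema.
    """
    events = []
    open_tags = []
    i = 0
    while i < len(data):
        opcode = data[i]
        if opcode == START:
            # START is followed by a one-byte token identifier.
            name = token_names[data[i + 1]]
            open_tags.append(name)
            events.append(("start", name))
            i += 2
        elif opcode == END:
            # END closes the most recently opened element.
            events.append(("end", open_tags.pop()))
            i += 1
        elif opcode == TEXT:
            # TEXT is followed by a one-byte length and UTF-8 bytes.
            length = data[i + 1]
            text = data[i + 2 : i + 2 + length].decode("utf-8")
            events.append(("text", text))
            i += 2 + length
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at offset {i}")
    return events


# Hypothetical token table and encoded form of <order><item>5</item></order>.
TOKENS = {1: "order", 2: "item"}
ENCODED = bytes([START, 1, START, 2, TEXT, 1]) + b"5" + bytes([END, END])
decoded_events = decode(ENCODED, TOKENS)
```

Without the token table, the identifiers 1 and 2 in the stream carry no element names, which is why a decoder of this kind depends on the schema used for encoding.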
It is desirable to optimize streaming evaluation techniques in order to provide more efficient evaluation of XPath expressions in a database system. Increased efficiency may allow for faster streaming evaluations, less demand on computer resources during streaming evaluation, or both.