1. Field of the Invention
This invention relates to XML index scans and more particularly relates to skipping XML index scans with common ancestors of a previously failed predicate based on feedback received from a query evaluation.
2. Description of the Related Art
XPath is an expression language optimized for addressing elements of an XML document. The XML document may be analyzed as an XML tree by placing each element of the XML document as a node in the XML tree. The XML tree will include parent-child nodes directly related to the nested elements in the XML document. XPath expressions describe a path in the XML tree.
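As a minimal sketch of the tree view described above, the following Python fragment parses a small XML document and lists the path to each leaf node. The sample document and the helper name `leaf_paths` are illustrative assumptions, and the standard library's `xml.etree.ElementTree` supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = """<orders>
  <po><billTo><purchaserName>Ann</purchaserName></billTo></po>
  <po><billTo><purchaserName>Bob</purchaserName></billTo></po>
</orders>"""

root = ET.fromstring(DOC)


def leaf_paths(node, prefix=""):
    """Yield (path, value) for every leaf element below and including node."""
    path = f"{prefix}/{node.tag}"
    children = list(node)
    if not children:
        yield path, (node.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)


# Each nested element becomes a child node; a path names a route from the root.
paths = list(leaf_paths(root))

# A path expression addresses every node reachable by that route.
names = [e.text for e in root.findall("./po/billTo/purchaserName")]
```

Both "purchaserName" elements share the same path, which is why a path alone does not identify a single node in the tree.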
An XML index scan identifies paths of an XML document that satisfy a search query (typically an XPath expression). These paths are identified by searching an XML index. The search query may be an XML query designed to locate one or more entries in the XML document using one or more search values or predicates. The XML index includes entries that reference a path in the XML document, a node identifier for the path, a document identifier for the XML document, and a value in the XML document located by the path.
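The four fields of an index entry listed above can be sketched as a small record type. The class name `IndexEntry`, the helper `build_index`, and the use of a tuple of child positions as the node identifier are assumptions made for illustration; the text does not specify how node identifiers are encoded.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class IndexEntry:
    path: str        # path in the XML document
    node_id: tuple   # node identifier (assumed: child positions from the root)
    doc_id: int      # document identifier
    value: str       # value located by the path


def build_index(doc_id, xml_text):
    """Create one entry per leaf element of the document."""
    root = ET.fromstring(xml_text)
    entries = []

    def visit(node, path, node_id):
        path = f"{path}/{node.tag}"
        children = list(node)
        if not children:
            entries.append(
                IndexEntry(path, node_id, doc_id, (node.text or "").strip())
            )
        for position, child in enumerate(children, start=1):
            visit(child, path, node_id + (position,))

    visit(root, "", (1,))
    return entries


# Hypothetical sample document.
SAMPLE = (
    "<po><billTo><purchaserName>Ann</purchaserName></billTo>"
    "<items><itemName><price>5</price></itemName></items></po>"
)
entries = build_index(1, SAMPLE)
```

With this assumed encoding, two entries whose node identifiers share a prefix lie under a common ancestor in the XML tree.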
XPath expressions, also termed XML query expressions, may be derived from the XML query and used to locate elements in the XML document that satisfy one or more search predicates of the XML query. Accordingly, one or more XML query expressions may be needed to completely describe the XML query. An XML index scan may then filter the entries of the index by matching each index entry's value against a predicate of the XML query. The XML index scan may provide the information of one of the filtered entries to an XPath evaluation component to further qualify the path against the remaining predicates of the XML query. The XPath evaluation component uses this information in conjunction with the remaining XML query expressions to traverse the XML document, to identify the remaining predicates of the XML query, to locate a value in the XML document located by one of the remaining XML query expressions, and to determine if the value matches the XML query.
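The two stages described above, an index scan that filters on one predicate followed by an XPath evaluation of the remaining predicate, might be sketched as follows. The function names, the sample document, and the simplified index layout (position of the owning "po" element plus the price value) are assumptions for illustration only, not the structure of any particular database engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<orders>
  <po><billTo><purchaserName>Ann</purchaserName></billTo>
      <items><itemName><price>5</price></itemName></items></po>
  <po><billTo><purchaserName>Bob</purchaserName></billTo>
      <items><itemName><price>5</price></itemName></items></po>
</orders>"""

root = ET.fromstring(DOC)


def build_price_index(root):
    """Simplified index on price: (position of owning po, price value)."""
    index = []
    for po_pos, po in enumerate(root.findall("po")):
        for price in po.findall("./items/itemName/price"):
            index.append((po_pos, float(price.text)))
    return index


def index_scan(index, predicate):
    """Stage 1: keep only entries whose value satisfies the indexed predicate."""
    return [entry for entry in index if predicate(entry[1])]


def xpath_evaluate(root, entry, wanted_name):
    """Stage 2: traverse the document and check the remaining predicate."""
    po = root.findall("po")[entry[0]]
    return po.findtext("./billTo/purchaserName") == wanted_name


index = build_price_index(root)
candidates = index_scan(index, lambda value: value == 5.0)
matches = [e for e in candidates if xpath_evaluate(root, e, "Ann")]
```

Both entries pass the index scan, but only the entry under the first "po" element survives the evaluation of the remaining predicate.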
XML index scans filter entries in the index against a predicate of the XML query. Filtered entries are passed to the XPath evaluation component to further qualify the path against the remaining predicates of the XML query. An inefficiency arises when the XML index scan locates and passes to the XPath evaluation component a path that shares the disqualifying content of a previously disqualified path. In this situation, the XPath evaluation component evaluates substantially redundant paths and disqualifies each of them for the same content. Consequently, every path of a subtree may be disqualified for the same reason, yet every such path in the index is still evaluated.
For example, an XML document describing purchase orders may contain several “po” elements, one per purchase order. Each “po” element may have, as child nodes, a “billTo” element and an “items” element. The “billTo” element may have, as child nodes, a “purchaserName” element and a “purchaserAddress” element, each containing a value. Likewise, each “items” element may have, as child nodes, an “itemName” element with child nodes “productNumber,” “quantity,” and “price,” each containing a value.
An XML query may search for a name under the “billTo” element and a price under the “itemName” element. If the XML index includes entries whose paths lead to a “price” node, then every entry whose value matches the price predicate will be sent to the XPath evaluation component, even if the whole “po” subtree should be avoided because, under a different predicate, it has the wrong “purchaserName” under the “billTo” element.
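The redundancy in this purchase order example can be made concrete with a small model that counts evaluations. The dictionary layout, the names `price_index` and `run_query`, and the sample values are assumptions for illustration; the sketch only measures the repeated work and does not attempt to avoid it.

```python
# Hypothetical purchase orders: one purchaser name and several item prices each.
PURCHASE_ORDERS = {
    "po1": {"purchaserName": "Ann", "prices": [5, 5, 5]},
    "po2": {"purchaserName": "Bob", "prices": [5, 7]},
}


def price_index(orders):
    """Simplified index on price: (owning po, price value)."""
    return [(po_id, price) for po_id, po in orders.items() for price in po["prices"]]


def run_query(orders, wanted_name, wanted_price):
    """Return (results, evaluations, redundant evaluations)."""
    results = []
    evaluations = 0
    redundant = 0
    already_rejected = set()  # used only for counting, never for skipping
    for po_id, price in price_index(orders):
        if price != wanted_price:
            continue  # filtered out by the index scan
        evaluations += 1  # entry handed to the evaluation stage
        if orders[po_id]["purchaserName"] == wanted_name:
            results.append((po_id, price))
        elif po_id in already_rejected:
            redundant += 1  # same po rejected again for the same name
        else:
            already_rejected.add(po_id)
    return results, evaluations, redundant


results, evaluations, redundant = run_query(PURCHASE_ORDERS, "Bob", 5)
```

In this model, four entries pass the price predicate, and two of the four evaluations repeat a rejection of "po1" that was already established by the first entry under that element.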
From the foregoing discussion, Applicant asserts that a need exists for an apparatus and method that skips certain entries provided by an XML index scan. Beneficially, such an apparatus and method would save time by not processing disqualified XML document paths and thereby provide increased system throughput.