For a variety of reasons, it is advantageous to store XML documents within a database. An XML document is a document that conforms to the XML standard and is typically composed of a set of nodes arranged in a hierarchy. Each node of an XML document may be composed of a set of one or more tags, and each node may have a set of associated attributes. A node may also be associated with a portion of the text of the XML document.
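By way of illustration only, the node hierarchy described above may be inspected with Python's standard xml.etree.ElementTree module. The sample document, including the names purchaseOrder, customer, item, id, and sku, is hypothetical and is assumed solely for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
DOC = """<purchaseOrder id="1001">
  <customer>Alice</customer>
  <item sku="A1">Widget</item>
  <item sku="B2">Gadget</item>
</purchaseOrder>"""

root = ET.fromstring(DOC)


def describe(node, depth=0):
    """Return one line per node; indentation reflects depth in the hierarchy."""
    text = (node.text or "").strip()
    lines = [f"{'  ' * depth}{node.tag} attrs={node.attrib} text={text!r}"]
    for child in node:
        lines.extend(describe(child, depth + 1))
    return lines


outline = describe(root)
print("\n".join(outline))
```

Each printed line shows a node's tag, its associated attributes, and the portion of text associated with that node.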
Once a set of XML documents is stored within a database, it would be advantageous to use an XML query language to retrieve, from the database, those XML documents that match a set of search criteria. An XML query language is a language for expressing an operation, such as a search, to be performed on one or more XML documents. Illustrative examples of XML query languages are XPath and XQuery. To meet the demand for storing XML data in, and retrieving XML data from, relational databases, an industry standard (SQL/XML) has been developed to allow SQL to operate on XML.
An SQL/XML query may include XPath-based operations, such as EXTRACT, EXISTNODE, and EXTRACTVALUE, each of which operates on a portion of an XML document indicated by an XPath expression provided as an argument to the operator. EXISTNODE returns one value (e.g., 0) if there is no XML element at the position in the hierarchy indicated by the XPath expression, and a different value (e.g., 1) otherwise. EXTRACT returns a data stream representing the portion of the XML document that includes and descends from the XML element or elements indicated by the XPath expression. EXTRACTVALUE returns a scalar value, if any, from the XML element indicated by the XPath expression.
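The behavior of the three operators described above may be approximated, for illustration only, by the following Python sketch. The functions existnode, extract, and extractvalue are hypothetical stand-ins written for this example and are not the DBMS operators themselves. The sketch relies on xml.etree.ElementTree, which supports only a limited subset of XPath and evaluates paths relative to the root element, so the paths are written in the form ./customer rather than as absolute XPath expressions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, written without whitespace between elements.
DOC = ('<purchaseOrder id="1001"><customer>Alice</customer>'
       '<item sku="A1">Widget</item></purchaseOrder>')


def existnode(xml_text, path):
    """Return 1 if at least one element matches the path, and 0 otherwise."""
    root = ET.fromstring(xml_text)
    return 1 if root.find(path) is not None else 0


def extract(xml_text, path):
    """Return the matching elements, with their descendants, as serialized XML."""
    root = ET.fromstring(xml_text)
    return "".join(ET.tostring(e, encoding="unicode") for e in root.findall(path))


def extractvalue(xml_text, path):
    """Return the text of the first matching element, or None if none matches."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    return node.text if node is not None else None


print(existnode(DOC, "./customer"), existnode(DOC, "./shipTo"))
print(extract(DOC, "./item"))
print(extractvalue(DOC, "./customer"))
```

In this simplified sketch, every call parses the entire document before the path is evaluated.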
When an SQL command contains an XML expression, the DBMS may, prior to executing the SQL command, convert data stored within the DBMS to an XML form and send the XML form of the converted data to the process that implements the XPath operation. The XPath operation process parses the data to identify and return the indicated information. This process can be wasteful if only a portion of the converted data, stored separately in one or more columns of a relational or object-relational database, affects the results. It would be desirable to extract data from only the columns of interest with an SQL query. In addition, the use of an SQL query enables further SQL optimizations that fully exploit the object-relational storage. Such optimizations may not be available during parsing by an XPath operation. Based on the foregoing, there is a clear need for a mechanism to rewrite a query containing an XML expression, directed to an XML type object-relational construct, as an SQL query.
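The contrast drawn above may be sketched as follows, using Python's standard sqlite3 module as a stand-in for the DBMS. The table po, its columns, and both function names are assumptions made for this illustration only. The first function materializes an entire row as XML and then parses it to evaluate a path; the second reads only the column of interest.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical object-relational storage: one column per XML element.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE po (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO po VALUES (?, ?, ?)",
                 [(1, "Alice", 25.0), (2, "Bob", 40.0)])


def customer_via_xml(po_id):
    """Unrewritten approach: convert the whole row to XML, then parse it.

    Assumes a row with the given id exists.
    """
    row = conn.execute(
        "SELECT id, customer, total FROM po WHERE id = ?", (po_id,)).fetchone()
    root = ET.Element("purchaseOrder", {"id": str(row[0])})
    ET.SubElement(root, "customer").text = row[1]
    ET.SubElement(root, "total").text = str(row[2])
    xml_text = ET.tostring(root, encoding="unicode")
    # Every column was converted, although only one element is read here.
    return ET.fromstring(xml_text).findtext("./customer")


def customer_via_sql(po_id):
    """Rewritten approach: query only the column the path refers to."""
    row = conn.execute(
        "SELECT customer FROM po WHERE id = ?", (po_id,)).fetchone()
    return row[0]


print(customer_via_xml(1), customer_via_sql(1))
```

Both functions return the same value, but the second neither constructs nor parses an XML form, and the database remains free to apply its ordinary SQL optimizations, such as index lookups, to the query.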
Techniques for retrieving XML documents stored in a database, using a database command that contains an embedded XML query language expression, are disclosed in the query rewrite patent (identified above in the section entitled Related Application Data). According to these techniques, a determination is made as to whether an embedded XML query language expression in a received database command may be transformed (rewritten) into a database operation. If it is determined that the embedded XML query language expression can be transformed, then the embedded XML query language expression is rewritten to be expressed as a database operation that does not involve the embedded XML query language expression.
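The two-step flow described above (determine whether the expression can be transformed, then rewrite it) may be illustrated with the following toy sketch. It is not the method of the query rewrite patent. The lookup table XPATH_TO_COLUMN, the function rewrite_extractvalue, and the table and column names are all hypothetical, and an actual DBMS would be expected to derive such a mapping from how the XML type is stored rather than from a hard-coded dictionary.

```python
# Hypothetical mapping from simple XPath expressions to (table, column) pairs.
XPATH_TO_COLUMN = {
    "/purchaseOrder/customer": ("po", "customer"),
    "/purchaseOrder/total": ("po", "total"),
}


def rewrite_extractvalue(xpath):
    """Return an equivalent SQL query, or None if no rewrite is possible."""
    target = XPATH_TO_COLUMN.get(xpath)
    if target is None:
        # Determination step failed: the caller would fall back to
        # evaluating the XPath expression against the XML form.
        return None
    table, column = target
    # Rewrite step: the result no longer contains the XPath expression.
    return f"SELECT {column} FROM {table}"


print(rewrite_extractvalue("/purchaseOrder/customer"))
print(rewrite_extractvalue("/purchaseOrder//note"))
```

An expression that has no entry in the mapping is reported as not rewritable, which corresponds to the case in which the embedded expression must still be evaluated as an XML operation.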
While these techniques enable certain database commands containing embedded XML expressions to be processed more efficiently by the DBMS, they do not address the problem of transforming database commands containing embedded XML expressions when the embedded XML expressions define a search for a set of XML documents that match a set of specified search criteria.
Consequently, what is needed is an approach for efficiently integrating query rewriting techniques with full-text searching, by means of an XML expression, of XML documents stored within a database.
The approaches described in this section are (a) approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued, or (b) approaches that have been developed either by the inventors of the present application or internally within the assignee of the present application. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.