The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Relational database management systems (RDBMS) store information in tables, where each piece of data is stored at a particular row and column. Information in a given row generally is associated with a particular object, and information in a given column generally relates to a particular category of information. For example, each row of a table may correspond to a particular employee, and the various columns of the table may correspond to employee names, employee social security numbers, and employee salaries.
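By way of illustration only, the row-and-column organization described above can be sketched with Python's built-in `sqlite3` module. SQLite serves here merely as a convenient stand-in for an RDBMS, and the table name `employees`, the column names, and the sample values are all hypothetical.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, ssn TEXT, salary INTEGER)")

# Each tuple is one row (one employee); each position is one column.
conn.executemany(
    "INSERT INTO employees (name, ssn, salary) VALUES (?, ?, ?)",
    [
        ("Alice", "000-00-0001", 85000),
        ("Bob", "000-00-0002", 62000),
    ],
)

# Reading one column across all rows yields one category of information.
salaries = [
    row[0]
    for row in conn.execute("SELECT salary FROM employees ORDER BY name")
]

# Reading one row yields all stored information about a single employee.
alice = conn.execute(
    "SELECT name, ssn, salary FROM employees WHERE name = ?", ("Alice",)
).fetchone()
```

Here `salaries` collects a single column across every row, while `alice` holds every column of a single row.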
A user retrieves information from and makes updates to a database by interacting with a database application. The user's actions are converted into a query by the database application. The database application submits the query to a database server. The database server responds to the query by accessing the tables specified in the query to determine which information stored in the tables satisfies the query. The information that satisfies the query is retrieved by the database server and transmitted to the database application. Alternatively, a user may request information directly from the database server by constructing and submitting a query directly to the database server using a command line or graphical interface.
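The interaction described above might look roughly like the following sketch, in which a hypothetical application function converts a user's request into a parameterized query and submits it. An in-memory SQLite database is assumed in place of a separate database server, and the names `open_database` and `employees_earning_at_least` are invented for illustration.

```python
import sqlite3

def open_database():
    """Stand-in for a database server holding one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employees (name, salary) VALUES (?, ?)",
        [("Alice", 85000), ("Bob", 62000), ("Carol", 91000)],
    )
    return conn

def employees_earning_at_least(conn, minimum_salary):
    """Application-side helper: convert the user's request into a query,
    submit it, and return the names from the rows that satisfy it."""
    query = "SELECT name FROM employees WHERE salary >= ? ORDER BY name"
    return [row[0] for row in conn.execute(query, (minimum_salary,))]

conn = open_database()
well_paid = employees_earning_at_least(conn, 80000)
```

Only the rows that satisfy the query are returned to the caller; the application never sees the remaining rows.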
Queries submitted to the database server must conform to the syntactical rules of a particular query language. One popular query language, known as the Structured Query Language (SQL), provides users a variety of ways to specify information to be retrieved. Another query language, designed for querying data represented in the Extensible Markup Language (XML), is the XML Query Language (XQuery). XQueryX is an XML representation of the XQuery language. XQuery is described in “XQuery 1.0: An XML Query Language,” W3C Working Draft, 23 Jul. 2004, at www.w3.org/TR/xquery. XQueryX is described in “XML Syntax for XQuery 1.0 (XQueryX),” W3C Working Draft, 19 Dec. 2003, at www.w3.org/TR/xqueryx. A related technology, XPath, is described in “XML Path Language (XPath) 2.0,” W3C Working Draft, 12 Nov. 2003, at www.w3.org/TR/xpath20. XQuery and XQueryX may use XPath for path traversal.
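As a rough illustration of path traversal, the sketch below uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath and is neither an XQuery nor an XPath 2.0 implementation. The XML document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; element names are illustrative only.
doc = ET.fromstring(
    "<employees>"
    "<employee><name>Alice</name><salary>85000</salary></employee>"
    "<employee><name>Bob</name><salary>62000</salary></employee>"
    "</employees>"
)

# Path steps: from the root, select every <name> child of every <employee>.
names = [element.text for element in doc.findall("./employee/name")]

# Predicate: select the <employee> whose <name> child has the text 'Bob'.
bob = doc.find("./employee[name='Bob']")
bob_salary = bob.findtext("salary")
```

Each path expression walks the document tree step by step, and the bracketed predicate restricts the selection to the elements that satisfy a condition.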
In a data-integration environment, an XQuery engine typically runs in a middle-tier engine and offers XQuery service to applications by evaluating XQuery queries against various back-end XML data sources. One XML data source may be a simple file system repository storing XML documents as plain files. Another XML data source may be a relational database management system (RDBMS) whose data can be reformatted into XML and returned to the middle-tier engine. Such an RDBMS is not itself capable of processing XQuery operations. Therefore, the constructed XML must be returned to the middle-tier engine so that the middle-tier engine may perform the XQuery operations. Another XML data source may be an SQL/XML-enabled RDBMS that can natively process XQuery. A further XML data source may be an SQL/XML-enabled RDBMS that embeds a file-system repository containing XML documents.
The XQuery engine running in the middle-tier engine evaluates the XQuery by pulling the data from the back-end XML data sources and processing the XQuery operations against the retrieved XML data. This “one-size-fits-all” approach handles all XQuery operations in the middle-tier XQuery engine. This approach may be inefficient, since much of the data retrieved from an XML data source is filtered out once the middle-tier engine processes the XQuery operations on that data. Therefore, retrieving the data from the XML data source may waste considerable bandwidth. Also, because the XML data must be constructed (from the underlying representation into XML) and sent to the middle-tier engine, the middle-tier engine cannot optimize execution of the XQuery operations based on the original storage configuration.
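The bandwidth cost described above can be made concrete with a small sketch. It uses SQL over an in-memory SQLite database rather than XQuery over XML, purely as an assumption for brevity. The table, the function names, and the row counts are hypothetical, and the second function is shown only for contrast with the pull-everything approach.

```python
import sqlite3

# Hypothetical data source with 100 rows; salaries run 50000..149000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [(f"emp{i}", 50000 + i * 1000) for i in range(100)],
)

def middle_tier_filter(conn, minimum):
    """Pull every row from the source, then filter outside the source."""
    rows = conn.execute("SELECT name, salary FROM employees").fetchall()
    kept = [row for row in rows if row[1] >= minimum]
    return kept, len(rows)  # second value: rows transferred

def source_side_filter(conn, minimum):
    """For contrast: let the source filter, so only matches are transferred."""
    rows = conn.execute(
        "SELECT name, salary FROM employees WHERE salary >= ?", (minimum,)
    ).fetchall()
    return rows, len(rows)

kept_a, transferred_a = middle_tier_filter(conn, 140000)
kept_b, transferred_b = source_side_filter(conn, 140000)
```

Both functions produce the same ten matching rows, but the first transfers all 100 rows to obtain them, so 90 of the transferred rows are discarded after retrieval.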
Therefore, there is clearly a need for techniques that overcome the shortfalls of the approach described above.