XML has emerged as the dominant standard for exchanging business data over the Internet. Existing relational data must therefore be published as XML. One way to do so is with an XML query processor that sits as a translation layer on top of a relational database and provides a default XML view of the underlying relational schema. Users write XML queries over the default XML view; these queries are translated to SQL queries, which are executed in the relational engine.
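As an illustration only, the following sketch shows one possible form of a default XML view: each table becomes an element, each row a child element, and each column a nested element. The use of SQLite, the `hotel` table, the `row` element name, and the function `default_xml_view` are all assumptions made for this example; they are not taken from any particular XML query processor.

```python
import sqlite3
import xml.etree.ElementTree as ET


def default_xml_view(conn, table):
    """Render every row of `table` as a <row> element with one child per column.

    Hypothetical helper: the element layout is an assumed convention.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    # cursor.description holds one 7-tuple per result column; item 0 is the name.
    columns = [d[0] for d in cur.description]
    root = ET.Element(table)
    for values in cur:
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, values):
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return root


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotel (id INTEGER, name TEXT)")
conn.execute("INSERT INTO hotel VALUES (1, 'Grand')")

view = default_xml_view(conn, "hotel")
xml_text = ET.tostring(view, encoding="unicode")
print(xml_text)
```

In this sketch the column names `id` and `name` surface as element tags, which already hints at how schema information and data become intermixed once relational content is viewed as XML.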
The underlying relational schema is often designed independently of the XML publishing requirements. Therefore, relational meta-data (schema information) must often be treated as though it were data (and vice versa) when publishing XML documents. In other words, an XML query over the default XML view may need to query over both relational data and meta-data. Unfortunately, the underlying relational database system cannot support such queries because the SQL query language used in relational database systems, which is based on first-order logic, cannot query seamlessly across both relational data and meta-data.
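The gap can be made concrete with a small, hypothetical example. Suppose room types are stored as column names (meta-data) and a query asks for every hotel, room type, and price under 100. A single SQL statement cannot range over column names, so the sketch below reads them from the catalog in a separate step and then generates the SQL text. The `room_price` table, its contents, the helper names, and the use of SQLite's `PRAGMA table_info` are assumptions for illustration; this is a workaround that shows the problem, not the integrated processing called for here.

```python
import sqlite3

# Hypothetical schema: room types appear as column names, i.e. as meta-data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE room_price (hotel TEXT, standard REAL, deluxe REAL, suite REAL)"
)
conn.executemany(
    "INSERT INTO room_price VALUES (?, ?, ?, ?)",
    [("Grand", 90.0, 150.0, 300.0), ("Plaza", 120.0, 95.0, 250.0)],
)


def columns_of(conn, table):
    """Step 1: read the meta-data. Each PRAGMA row is (cid, name, type, ...)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def unpivot_sql(table, key, value_columns):
    """Step 2: generate SQL that turns each column name into a data value."""
    parts = [
        f'SELECT "{key}" AS hotel, \'{col}\' AS room_type, "{col}" AS price '
        f'FROM "{table}"'
        for col in value_columns
    ]
    return " UNION ALL ".join(parts)


room_types = [c for c in columns_of(conn, "room_price") if c != "hotel"]
sql = (
    "SELECT hotel, room_type, price FROM ("
    + unpivot_sql("room_price", "hotel", room_types)
    + ") WHERE price < 100 ORDER BY hotel, room_type"
)
cheap = conn.execute(sql).fetchall()
print(cheap)
```

The query text has to be assembled outside the relational engine, after a separate catalog lookup, which is the kind of split processing that a first-order language forces on the application.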
Some work on higher-order query languages is known in the art. For example, SchemaSQL is essentially a higher-order extension to SQL. Techniques for implementing SchemaSQL on top of a relational database system have been described in L. Lakshmanan et al. “SchemaSQL—A Language for Querying and Restructuring Multidatabase Systems”, Proceedings of the VLDB Conference, Bombay, India, September 1996 and L. Lakshmanan et al. “On Efficiently Implementing SchemaSQL on an SQL Database System”, Proceedings of the VLDB Conference, Edinburgh, Scotland, September 1999.
Microsoft's OLEDB provides a way to describe the result “shape” of a query that is being executed by a remote data source. See “Microsoft OLEDB 2.0 Programmer's Reference and Software Development Kit”, Microsoft Press, November 1998. However, OLEDB is only for processing SQL queries, not XML queries. Moreover, in OLEDB, an execution plan (i.e., internal representation) of the query is not provided by the remote data source, which makes it impossible to do some kinds of optimizations. For example, the internal representation of a remote query cannot be grafted onto the internal representation of a local query. This makes it impossible to globally optimize the combined local/remote query before executing it.
A method of efficiently executing “higher-order” XML queries that span relational data and meta-data is therefore needed. If meta-data query processing could be tightly integrated with regular query processing over relational data, then a large part of the computation of queries over relational data and meta-data could be pushed down to the relational database engine.