The Extensible Markup Language (XML) is an increasingly popular standard for data and documents that is finding wide acceptance in the computer industry. Customers want to expose data as XML so that it can be used with XML tools and applications. However, a significant fraction of enterprise data resides in relational and object-relational databases, which are queried using the Structured Query Language (SQL). This data is typically, though not exclusively, legacy data that has been stored and maintained for years, or even decades, using the relational data model.
There have been proposals on XML Generation, specifying how relational or object-relational data can be flexibly mapped to XML (see material at www.sqlx.org), and XML Querying, flexibly expressing queries over the underlying relational or object-relational data, in the SQL environment. However, storing and retrieving the data en bloc is not enough.
It is important to be able to efficiently execute queries over the relational and object-relational data using XML query languages. XPath is an important query language, which can be used in conjunction with SQL to express a large variety of useful queries.
Various approaches have been developed for executing XPath queries over relational and object-relational data exposed as XML. One approach for executing XPath queries over relational and object-relational data exposed as XML is referred to herein as the “dynamic-materialization” approach. According to the dynamic-materialization approach, the XML is materialized (e.g. as a DOM) from the relational/object-relational data in response to receiving an XPath query, and then the XPath query is executed over the dynamically-materialized XML. Unfortunately, materializing the entire XML document is extremely expensive, and it involves the use of large amounts of additional memory, CPU cycles, disk space (swap space) and other resources.
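The cost of the dynamic-materialization approach can be illustrated with a minimal Python sketch. It is not the mechanism of any particular database system: the in-memory SQLite tables, the element names, and the helper functions materialize_dept_xml, department_name and demo_connection are all assumptions made for illustration. The point it shows is that every department row, together with its nested employees, is turned into an XML tree before a single path is evaluated.

```python
import sqlite3
import xml.etree.ElementTree as ET


def demo_connection():
    """Create a small in-memory database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT, loc TEXT);
        CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
        INSERT INTO dept VALUES (2134, 'SALES', 'CHICAGO'), (10, 'ACCOUNTING', 'NEW YORK');
        INSERT INTO emp VALUES (7499, 'ALLEN', 2134), (7782, 'CLARK', 10);
    """)
    return conn


def materialize_dept_xml(conn):
    """Build one <ROW> element per department, including nested employees."""
    rows = []
    for deptno, dname, loc in conn.execute(
            "SELECT deptno, dname, loc FROM dept ORDER BY deptno"):
        row = ET.Element("ROW", {"DEPTNO": str(deptno)})
        ET.SubElement(row, "DEPTNAME").text = dname
        ET.SubElement(row, "LOC").text = loc
        employees = ET.SubElement(row, "EMPLOYEES")
        for empno, ename in conn.execute(
                "SELECT empno, ename FROM emp WHERE deptno = ? ORDER BY empno",
                (deptno,)):
            emp = ET.SubElement(employees, "EMP_T")
            ET.SubElement(emp, "EMPNO").text = str(empno)
            ET.SubElement(emp, "ENAME").text = ename
        rows.append(row)
    return rows


def department_name(conn, deptno):
    """Evaluate the DEPTNAME path, but only after materializing every row."""
    for row in materialize_dept_xml(conn):
        if row.get("DEPTNO") == str(deptno):
            return row.findtext("DEPTNAME")
    return None
```

Even though the caller wants a single scalar from a single department, the work done grows with the total number of departments and employees, which is the inefficiency described above.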
Another approach for executing XPath queries over relational and object-relational data exposed as XML is referred to herein as the “migration” approach. The migration approach involves migrating such data into (1) a different native XML database, or (2) XML-specific structures within a relational database. For example, once converted, the XML data may be stored in schema-based or non-schema-based XMLType tables. However, the migration approach involves data migration, which is expensive and often not feasible, since customers are reluctant to migrate systems that hold legacy data accumulated over many years.
Another approach for executing XPath queries over relational and object-relational data exposed as XML is referred to herein as the “middle-tier” approach. The middle-tier approach involves fetching the objects and relational data into a middle-tier layer, such as an application server, and performing XML manipulation there. Because the middle-tier approach involves performing sophisticated XML manipulation outside of the database system, an inefficiently large amount of data may need to be fetched into the middle tier.
Another approach for executing XPath queries over relational and object-relational data exposed as XML is referred to herein as the “rewrite” approach. According to the rewrite approach, XMLType views are used to make object-relational data available as XML. XPath queries that access schema-based and non-schema-based XMLType views are then dynamically rewritten to go directly over the underlying object-relational data. Since the entire XML does not need to be manifested, and the query is instead rewritten to go directly against the relational data, efficient access of the relational data can lead to orders-of-magnitude performance gains over previous approaches. Specific techniques for implementing the rewrite approach are described in the Rewrite Application, referred to above.
Specifically, the Rewrite Application describes a query rewrite system that takes as input (1) the XPath query being rewritten, and (2) the XMLType view or subquery over which the XPath query is being executed. Based on these inputs, the rewrite system rewrites the XPath query to directly access the underlying relational structures that contain the data exposed through the XMLType views.
For example, an XMLType view may be defined as follows:
create type emp_t as object (
  EMPNO    NUMBER(4),
  ename    VARCHAR2(10),
  job      VARCHAR2(9),
  mgr      NUMBER(4),
  HIREDATE DATE);

create type emp_list is varray(100) of emp_t;

create or replace type dept_t as object (
  "@DEPTNO" NUMBER(2),
  DeptNAME  VARCHAR2(14),
  LOC       VARCHAR2(13),
  employees emp_list);

create view dept_ov of dept_t
  with object id (deptname) as
  select deptno, dname, loc,
         CAST(MULTISET(
           select emp_t(empno, ename, job, mgr, hiredate)
           from emp e
           where e.deptno = d.deptno) AS emp_list)
  from dept d;

create view dept_xv of xmltype
  with object id (SYS_NC_ROWINFO$.extract('/ROW/@DEPTNO').getnumberval()) as
  select SYS_XMLGEN(VALUE(p))
  FROM dept_ov p;
Based on this XMLType view definition, the query rewrite rules described in the Rewrite Application may be used to rewrite the following XPath-based query (Q1):
SELECT extractvalue(value(p),'/ROW/DEPTNAME') DEPARTMENTNAME
from dept_xv p
where extract(value(p), '/ROW/@DEPTNO') = 2134;
into the following SQL query (Q2):
SELECT SYS_ALIAS_1.DNAME "DEPARTMENTNAME"
FROM DEPT SYS_ALIAS_1
WHERE SYS_ALIAS_1.DNO = 2134;
In this example, two separate sections of Q1 specify XPath operations. Specifically, “extractvalue(value(p),‘/ROW/DEPTNAME’)” specifies one XPath operation, and “extract(value(p), ‘/ROW/@DEPTNO’)” specifies another XPath operation. Sections of a query that specify an XPath operation shall be referred to herein as “XPath sections”.
Each XPath section uses an XPath string to identify the XML data upon which an operation is to be performed. For example, in Q1, the XPath string for the XPath section “extractvalue(value(p),‘/ROW/DEPTNAME’)” is “/ROW/DEPTNAME”. The XPath string for the XPath section “extract(value(p), ‘/ROW/@DEPTNO’)” is “/ROW/@DEPTNO”.
In the example given above, all XPath sections of Q1 have been rewritten in Q2 so that Q2 includes no XPath sections. Specifically, the XPath section “extractvalue(value(p), ‘/ROW/DEPTNAME’)” of Q1 was rewritten as “SYS_ALIAS_1.DNAME” in Q2. Similarly, the XPath section “extract(value(p), ‘/ROW/@DEPTNO’)” of Q1 was rewritten as “SYS_ALIAS_1.DNO” in Q2.
The overhead of rewriting an XPath query is incurred once per query, not once per row, since these operations are performed at compile-time. Consequently, the rewrite approach is highly efficient for XPath queries whose XPath sections can, in fact, be rewritten.
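The compile-time nature of the Q1-to-Q2 rewrite can be sketched as a purely textual transformation, shown here in Python. This is an illustrative toy and not the rewrite rules of the Rewrite Application: the PATH_TO_COLUMN table is hard-coded by hand (a real system would presumably derive it from the view definition), the regular expression is not a SQL parser, and the names rewrite, XPATH_SECTION and Q1 are hypothetical.

```python
import re

# Hypothetical, hand-written mapping from XPath strings to base-table columns.
# It mirrors the column names that appear in Q2.
PATH_TO_COLUMN = {
    "/ROW/DEPTNAME": "DNAME",
    "/ROW/@DEPTNO": "DNO",
}

# Matches an XPath section of the form extract(value(p), '...') or
# extractvalue(value(p), '...') and captures the XPath string.
XPATH_SECTION = re.compile(
    r"extract(?:value)?\(\s*value\(p\)\s*,\s*'([^']+)'\s*\)",
    re.IGNORECASE,
)

Q1 = ("SELECT extractvalue(value(p),'/ROW/DEPTNAME') DEPARTMENTNAME "
      "FROM dept_xv p "
      "WHERE extract(value(p), '/ROW/@DEPTNO') = 2134")


def rewrite(query, alias="SYS_ALIAS_1"):
    """Replace each XPath section with a column reference, once per query."""
    def to_column(match):
        # A KeyError here means the path has no known column mapping.
        return f"{alias}.{PATH_TO_COLUMN[match.group(1)]}"

    rewritten = XPATH_SECTION.sub(to_column, query)
    # Point the FROM clause at the base table instead of the XMLType view.
    return re.sub(r"\bdept_xv\s+p\b", f"DEPT {alias}", rewritten,
                  flags=re.IGNORECASE)
```

The transformation runs once on the query text and never touches row data, which is why its cost does not depend on the number of rows the rewritten query later scans.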
Unfortunately, not all XPath sections can be rewritten in this fashion. An XPath section may be non-rewritable for a variety of reasons. Some XPath sections are not rewritable, for example, because the XPath string contains a node that does not map to a column in an object-relational construct. In order to be “mappable,” the full structure needs to be known at compile time. The full structure may not be known at compile time because of the specific storage used, because of the XML schema, because of the XPath, or because the XML is generated from an arbitrary function. For example, if the target of an XPath string is an element of a parent that is stored as a LOB, then the XPath section cannot be entirely replaced by an SQL operation that produces the same result. An XPath string that cannot be fully mapped to a corresponding relational structure is referred to herein as an “unmappable path”.
If an XPath section specifies an unmappable path, then the XPath section is not rewritten, and the dynamic-materialization approach is used to evaluate the XPath operation specified in the XPath section. Specifically, the XML is manifested in memory (e.g. as a DOM) and the XPath is evaluated over the DOM. Unfortunately, the performance of the dynamic-materialization approach is often unacceptable.
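The per-section decision described above can be summarized in a short Python sketch. It assumes a hypothetical, hand-written table of mappable paths; the names MAPPABLE_PATHS, plan_xpath_section and plan_query, and the example path under a LOB-stored RESUME element, are illustrative inventions rather than part of any described system.

```python
# Hypothetical table of XPath strings whose full structure is known at
# compile time, together with the column each one maps to.
MAPPABLE_PATHS = {
    "/ROW/DEPTNAME": "DNAME",
    "/ROW/@DEPTNO": "DNO",
}


def plan_xpath_section(xpath, mappable_paths=MAPPABLE_PATHS):
    """Choose an evaluation strategy for one XPath section.

    Returns ("rewrite", column) for a mappable path, and
    ("materialize", None) for an unmappable path.
    """
    column = mappable_paths.get(xpath)
    if column is not None:
        return ("rewrite", column)
    return ("materialize", None)


def plan_query(xpath_strings, mappable_paths=MAPPABLE_PATHS):
    """Plan every XPath section of a query independently."""
    return [plan_xpath_section(x, mappable_paths) for x in xpath_strings]
```

For instance, a path such as "/ROW/RESUME/SKILL", where RESUME is assumed to be stored as a LOB and therefore has no column for its child element, would be planned as ("materialize", None) and would incur the materialization cost for that section.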
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.