1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of queries that include self joins.
2. Description of Related Art
Since its introduction, XML, the eXtensible Markup Language, has quickly emerged as a universal format for publishing and exchanging data over the World Wide Web. However, problems still exist in publishing data from object-relational databases as XML documents.
In the business-to-business e-commerce area, there is a widely recognized need to create XML documents by combining one or more object-relational tables, e.g., by creating an XML purchase order by joining a customer table with information drawn from other tables. A relational join is a well-known operation that combines information from two base tables by creating pairs of rows that satisfy a join predicate.
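By way of illustration only, the join operation described above can be sketched with Python's built-in sqlite3 module. The table names CUSTOMER and PURCHASE_ORDER, their columns, and the sample rows are hypothetical and do not come from any particular system; the sketch merely shows that only rows satisfying the join predicate are paired.

```python
import sqlite3

# In-memory database; all table names, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.execute(
    "CREATE TABLE PURCHASE_ORDER "
    "(PO_ID INTEGER PRIMARY KEY, CUST_ID INTEGER, ITEM TEXT)"
)
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO PURCHASE_ORDER VALUES (?, ?, ?)",
    [(10, 1, "bolts"), (11, 1, "nuts"), (12, 3, "gears")],
)

# The join pairs each customer row with each order row for which the
# predicate C.CUST_ID = P.CUST_ID holds.
join_rows = conn.execute(
    "SELECT C.NAME, P.ITEM "
    "FROM CUSTOMER C, PURCHASE_ORDER P "
    "WHERE C.CUST_ID = P.CUST_ID "
    "ORDER BY P.PO_ID"
).fetchall()

print(join_rows)
```

In this sketch, customer 2 has no orders and order 12 refers to no existing customer, so neither appears in the result.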
In comparison with simple select queries over a single table, join queries are costly in terms of system performance, and much research has been done to optimize them. Query rewrite optimizations can sometimes be used to transform join queries into simple select queries.
For example, if a join is a self join, and the join predicate links the two quantifiers on the table's key columns, the query can be rewritten into a simple select. This is illustrated using the following example:
SELECT E1.SAL, E2.SAL
FROM EMP E1, EMP E2
WHERE E1.NO=E2.NO
The query pairs the salaries of employees whose values of attribute NO match. Since NO represents employee numbers and is also the table's key, each row represented by quantifier E1 matches only itself in E2. Therefore, the above query can safely be rewritten as the following query, preserving its semantics:
SELECT E1.SAL, E1.SAL
FROM EMP E1
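The equivalence of the two queries can be checked on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module, not a description of how any commercial optimizer performs the rewrite. The sample rows and the table EMP_NOKEY are assumptions introduced for illustration, and the column name NO is written in double quotes only because it is also an SQL keyword in some dialects. The second half of the sketch shows why the key condition matters: when NO is not a key and contains duplicates, the self join and the simple select return different results.

```python
import sqlite3


def run(conn, sql):
    """Return the query result as a sorted list so row order is irrelevant."""
    return sorted(conn.execute(sql).fetchall())


conn = sqlite3.connect(":memory:")

# Hypothetical EMP table in which NO is the key (sample rows are assumed).
conn.execute('CREATE TABLE EMP ("NO" INTEGER PRIMARY KEY, SAL INTEGER)')
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?)", [(1, 100), (2, 200), (3, 200)]
)

keyed_join = run(
    conn,
    'SELECT E1.SAL, E2.SAL FROM EMP E1, EMP E2 WHERE E1."NO" = E2."NO"',
)
keyed_select = run(conn, "SELECT E1.SAL, E1.SAL FROM EMP E1")

# Counter-example (hypothetical table): NO is not a key and has a duplicate.
conn.execute('CREATE TABLE EMP_NOKEY ("NO" INTEGER, SAL INTEGER)')
conn.executemany(
    "INSERT INTO EMP_NOKEY VALUES (?, ?)", [(1, 100), (1, 150)]
)

nokey_join = run(
    conn,
    'SELECT E1.SAL, E2.SAL FROM EMP_NOKEY E1, EMP_NOKEY E2 '
    'WHERE E1."NO" = E2."NO"',
)
nokey_select = run(conn, "SELECT E1.SAL, E1.SAL FROM EMP_NOKEY E1")

print(keyed_join == keyed_select)   # key present: results agree
print(nokey_join == nokey_select)   # no key: results differ
```

With the key in place, each E1 row finds exactly one partner in E2, namely itself. Without it, the two rows sharing NO = 1 pair with each other as well, so the join yields four rows while the simple select yields two.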
The principle of self join elimination can be extended beyond simple queries, such as the example above, to include more complex queries. Commercial database management systems, such as DataBase 2 (DB2™) Universal DataBase (UDB™) sold by IBM Corporation, the assignee of the present invention, implement a number of different query rewrite transformations that enhance the performance of such queries.
However, there is still a need for improved techniques for optimizing self joins. Specifically, there is a need in the art for transformations of self joins that are transitively derived through table expressions which themselves cannot be simplified using a SELECT-MERGE query rewrite optimization.