Complex queries are often encountered in larger database systems because of a desire to extract a significant amount of information in a single query. In such queries, it is common to find subqueries and aggregation functions. It is also common with such queries to find that the subquery is a correlated subquery. A correlated subquery is one in which the values from an outer query block are used to compute the inner subquery.
By way of example, a database might have two tables: an employee table and a department table. A query of this database may seek to identify the employees from a specific location and across all departments who have a salary greater than the average salary in their particular department. Such a query may take the form:
QUERY ONESELECT emp_id, emp_name, dept_nameFROM employee E, department DWHERE E.dept_num = D.dept_num ANDE.state = ‘CALIFORNIA’ ANDE.salary >(SELECT AVG (salary)FROM employee E1WHERE E1.dept_num = D. dept_num)
The main query block joins the employee and the department tables since the employee names and department names must be retrieved from columns in the respective tables. The subquery regarding the salary predicate is a correlated subquery through the equality predicate involving D.dept_num. For each employee in the outer query, the average salary for all the employees in that particular department is calculated in the subquery to determine if the employee's salary is greater than the average. Consequently, the employee table is read multiple times for both the outer query block and the subquery block.
In a partitioned (shared nothing) environment, this “tuple-at-a-time” approach to query processing results in significant network traffic and poor processing speed.
One improvement to the tuple-at-a-time approach is to employ “Magic Decorrelation”. This method of query transformation changes the query into a decorrelated query that is structured to first extract a set of relevant keys based upon the outer query block predicates and to then materialize a superset of distinct values based upon the subquery predicates and the set of relevant keys. The decorrelated query then uses this materialized set of values to perform the outer block selection.
While Magic Decorrelation often achieves superior performance as compared to the tuple-at-a-time approach, the decorrelated query still accesses tables multiple times. What is therefore needed is a method for further reducing the number of times a table or view is accessed when executing a query. The need for such a system has heretofore remained unsatisfied.