The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Relational and object-relational database management systems store information in tables, where a piece of data is stored at a particular row and column. To retrieve data, queries that request data are submitted to a database server, which evaluates the queries and returns the data requested.
Queries submitted to the database server must conform to the syntactical rules of a particular query language. One popular query language, known as the Structured Query Language (SQL), provides users a variety of ways to specify information to be retrieved. In SQL and other query languages, queries may have expensive expressions, particularly expensive predicates, which may, for example, take the form of subquery expressions, user-defined operators, PL/SQL functions, or other types of expressions that are relatively expensive to evaluate.
Consider a query statement, Q1, as follows
SELECT prod_idFROM (SELECT prod_id  FROM Prod_id_table  WHERE P_expensive_1) prod_id_viewWHERE P_cheap;Q1 is a containing query with respect to the subquery within the parentheses. Q1 has a filter predicate, P_cheap.
The subquery within the parentheses in Q1 has a filter predicate in the subquery “WHERE” clause. The term “filter predicate” refers to any non-join predicate. This filter predicate in the subquery has an expensive expression, P_expensive—1, which may, for example, take a form as follows:
(prod_name IN (SELECT prod_nameFROM Prod_name_tableWHERE Prod_source = ‘mega_supplier’))
A technique for evaluating queries, referred to herein as the early evaluation technique, evaluates filter predicates as early as possible, in order to reduce the amount of data processed in later operations such as sorts and joins.
The early evaluation technique, however, is not optimal if filter predicates comprise expensive expressions, as the filter predicates will be evaluated for every row in a table at the outset of the query execution and this evaluation may be exorbitantly expensive if the table contains a large number of rows, such as millions. While early evaluation may theoretically reduce the number of rows for later operations, this row reduction may not be worthwhile if the cost of the predicate evaluation is high. The benefit of the row reduction is further reduced if the filter predicates comprising the expensive expressions are not selective, i.e., incapable of significantly reducing the number of rows.
For example, in a query such as Q1, a subquery has a predicate relating to an expensive expression and the containing query has a “cheap” and selective predicate. A cheap predicate is a filter predicate comprising cheap expressions. The early evaluation technique is also not optimal. In Q1, the early evaluation technique requires that P_expensive—1 be evaluated for every row in Prod_id_table at the outset of the query execution. Depending on the selectivity of the predicate relating to P_expensive—1, the early evaluation technique can reduce the number of the rows processed by later operations performed during the computation of query Q1. However, the reduction may not sufficiently compensate for the high cost of evaluating P_expensive—1.
Furthermore, the early evaluation technique is not optimal in situations where an expensive expression appears in a SELECT list, instead of a predicate. Consider a query, Q2, as follows:
SELECT *FROM (SELECT E_expensive (c, d)  FROM Prod_id_table  WHERE <predicates>) vWHERE P_cheap;Q2 contains a subquery within the parentheses. E_expensive in the subquery SELECT list is not only an expensive expression but also a function of columns c and d in table Prod_id_table.
Given Q2, no matter how selective the predicate P_cheap in the containing query may be, the early evaluation technique still evaluates E_expensive for every row in table Prod_id_table inside the subquery so long as the row satisfies the subquery predicate. As in the case of Q1, Q2 becomes costly to run in situations where table Prod_id_table contains a large number of rows that satisfy the subquery predicate.
Based on the discussion above, there is clearly a need for techniques that overcome the shortfalls of early evaluation technique.