The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Relational database management systems store information in tables, where each piece of data is stored at a particular row and column. Information in a given row generally is associated with a particular object, and information in a given column generally relates to a particular category of information. For example, each row of a table may correspond to a particular employee, and the various columns of the table may correspond to employee names, employee social security numbers, and employee salaries.
A user retrieves information from and submits updates to a database by interacting with a database application. The database application converts the user's actions into a query and submits the query to a database server. The database server responds by accessing the tables specified in the query to determine which information stored in the tables satisfies the query. The database server then retrieves the information that satisfies the query and transmits it to the database application. Alternatively, a user may request information directly from the database server by constructing a query and submitting it to the database server through a command line or graphical interface.
Queries submitted to the database server must conform to the rules of a particular query language. One popular query language, known as the Structured Query Language (SQL), provides users a variety of ways to specify information to be retrieved. In SQL and other query languages, queries may include inner query blocks. In SQL, a query has an outer query block and may have one or more inner query blocks. For example, the query
SELECT T1.x
FROM table1 T1, parts P
WHERE P.y = T1.y AND P.z = 'MED BOX'
  AND T1.quantity < (SELECT AVG (T2.quantity)
    FROM Table2 T2
    WHERE T2.partkey = P.partkey)
  AND P.quantity < (SELECT AVG (T3.quantity)
    FROM Table3 T3
    WHERE T3.serialnum = T1.serialnum);
has two inner query blocks, each of which is a subquery:
SELECT AVG (T2.quantity) FROM Table2 T2 WHERE T2.partkey = P.partkey
and
SELECT AVG (T3.quantity) FROM Table3 T3 WHERE T3.serialnum = T1.serialnum.
In some database systems, a query submitted by a user may be manipulated in order to determine a lower cost, semantically equivalent form of the query. The lower cost semantically equivalent version of the query is then executed in place of the original query. Since the lower cost query is semantically equivalent to the original query, the same results are produced. However, since the chosen semantically equivalent query has a lower cost than the original query, the same results are obtained with less computational cost than would have been incurred if the original query were executed.
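As an illustration of semantic equivalence, the following minimal sketch runs the example query above and one possible unnested form of it against the same data and checks that both return the same rows. The unnested form replaces each correlated subquery with a grouped inline view; it is only one of several possible rewrites. The column types, the sample rows, the view aliases V1 and V2, and the helper names build_demo_db and run are assumptions made for this sketch, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Original form: two correlated subqueries (inner query blocks).
NESTED_QUERY = """
SELECT T1.x
FROM table1 T1, parts P
WHERE P.y = T1.y AND P.z = 'MED BOX'
  AND T1.quantity < (SELECT AVG(T2.quantity) FROM Table2 T2
                     WHERE T2.partkey = P.partkey)
  AND P.quantity < (SELECT AVG(T3.quantity) FROM Table3 T3
                    WHERE T3.serialnum = T1.serialnum)
"""

# One possible unnested form: each subquery becomes a grouped inline view
# that is joined to the outer query block.
UNNESTED_QUERY = """
SELECT T1.x
FROM table1 T1, parts P,
     (SELECT partkey, AVG(quantity) AS avg_qty
      FROM Table2 GROUP BY partkey) V1,
     (SELECT serialnum, AVG(quantity) AS avg_qty
      FROM Table3 GROUP BY serialnum) V2
WHERE P.y = T1.y AND P.z = 'MED BOX'
  AND V1.partkey = P.partkey AND T1.quantity < V1.avg_qty
  AND V2.serialnum = T1.serialnum AND P.quantity < V2.avg_qty
"""


def build_demo_db():
    """Create an in-memory database with made-up schema and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (x INTEGER, y INTEGER, quantity REAL, serialnum INTEGER);
        CREATE TABLE parts  (y INTEGER, z TEXT, quantity REAL, partkey INTEGER);
        CREATE TABLE Table2 (partkey INTEGER, quantity REAL);
        CREATE TABLE Table3 (serialnum INTEGER, quantity REAL);

        INSERT INTO table1 VALUES (1, 10, 5, 100), (2, 10, 50, 100),
                                  (3, 20, 1, 200), (4, 30, 2, 300),
                                  (5, 10, 8, 100);
        INSERT INTO parts  VALUES (10, 'MED BOX', 3, 7), (20, 'MED BOX', 3, 8),
                                  (30, 'SM BOX', 1, 7);
        INSERT INTO Table2 VALUES (7, 10), (7, 30), (8, 0.5), (8, 0.5);
        INSERT INTO Table3 VALUES (100, 4), (100, 6), (200, 10);
    """)
    return conn


def run(conn, sql):
    """Return the selected x values, sorted so the comparison ignores row order."""
    return sorted(row[0] for row in conn.execute(sql))


if __name__ == "__main__":
    conn = build_demo_db()
    print(run(conn, NESTED_QUERY))
    print(run(conn, UNNESTED_QUERY))
```

Both forms return the same rows on this data, but a database server may evaluate them with very different amounts of work, which is why the choice between such forms is made on the basis of estimated cost.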
In order to determine the lowest cost, semantically equivalent query, the costs of two or more semantically equivalent queries must be determined. In one approach to determining the costs, a set of candidate, semantically equivalent queries is determined, and the cost of each of these queries is determined. The problem with this approach, however, is that determining the cost of each query is, in itself, a computationally expensive undertaking. In the example above, one or both of the subqueries may be nested (remain in the original form as above), unnested by merging these query blocks into the outer query, or unnested by generating inline views. Furthermore, if these inline views are mergeable, then they may be merged into the outer query block. Therefore, even in this simple case there will be up to eight different possible choices for alternative, semantically equivalent queries, and determining the costs of all of the alternative queries may be prohibitively expensive.
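The growth in the number of candidates can be sketched as follows. The sketch assumes one reading of the count above: each of the two subqueries is independently left nested, unnested and merged into the outer query block, or unnested into an inline view, which gives nine combinations, eight of which differ from the original query. The state names, the labels SQ1 and SQ2, the function names, and the placeholder cost figures are all hypothetical; a real cost estimate would require optimizing each candidate query, which is the expensive step.

```python
from itertools import product

# Hypothetical labels for the form each subquery may take.
NESTED = "nested"
MERGED = "unnested_merged"
INLINE_VIEW = "unnested_inline_view"
STATES = (NESTED, MERGED, INLINE_VIEW)


def candidate_forms(subqueries):
    """Return every assignment of a form to each subquery."""
    return [dict(zip(subqueries, combo))
            for combo in product(STATES, repeat=len(subqueries))]


def alternatives(subqueries):
    """Return the candidates that differ from the original, fully nested query."""
    original = {sq: NESTED for sq in subqueries}
    return [c for c in candidate_forms(subqueries) if c != original]


def cheapest(candidates, estimate_cost):
    """Exhaustive approach: estimate the cost of every candidate, keep the minimum."""
    return min(candidates, key=estimate_cost)


# Placeholder per-subquery costs, invented purely for illustration.
TOY_COST = {NESTED: 10, MERGED: 4, INLINE_VIEW: 6}


def toy_estimate(candidate):
    """Stand-in for a real cost estimate, which would be far more expensive."""
    return sum(TOY_COST[form] for form in candidate.values())


if __name__ == "__main__":
    subqueries = ["SQ1", "SQ2"]
    print(len(alternatives(subqueries)))          # alternatives to the original
    print(cheapest(candidate_forms(subqueries), toy_estimate))
```

Under this reading, the number of candidates grows as three to the power of the number of subqueries, so a query with only a handful of inner query blocks already requires hundreds of cost estimates if every candidate is costed separately.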
Based on the discussion above, there is clearly a need for techniques that overcome the shortfalls of previous approaches.