1. Field of the Invention
The present invention relates generally to data processing environments and, more particularly, to a database system with methodology for generating bushy nested loop join trees.
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Part I (especially Chapters 1-4), Addison-Wesley, 2000.
One purpose of a database system is to answer queries requesting information from the database. A query may be defined as a logical expression over the data and the data relationships set forth in the database, and results in the identification of a subset of the database. Consider, for instance, the execution of a request for information from a relational DBMS. In operation, this request is typically issued by a client system as one or more Structured Query Language or “SQL” queries for retrieving particular data from database tables on a server. The syntax of SQL is well documented, see e.g., “Information Technology—Database languages—SQL”, published by the American National Standards Institute as American National Standard ANSI/ISO/IEC 9075: 1992, the disclosure of which is hereby incorporated by reference.
SQL queries express what results are requested but do not state how the results should be obtained. In other words, the query itself does not tell how the query should be evaluated by the DBMS. Rather, a component of the DBMS called the optimizer determines the “plan” or the best method of accessing the data to implement the SQL query. The query optimizer is responsible for transforming an SQL request into an access plan composed of specific implementations of the algebraic operator selection, projection, join, and so forth. The role of a query optimizer in a relational DBMS system is to find an adequate execution plan from a search space of many semantically equivalent alternatives.
Modern query processing relies on the theoretical basis of any RDBMS: relational algebras. The query execution plan of a query is an implementation of the relational algebra expression that describes that query. For instance, consider the following query:
SELECT r.a, SUM (s.a) FROM r, s
WHERE r.b=s.b AND r.c=0
GROUP BY r.a
The above query can be described by the following canonic relational expression:
(group <r.a; sum(s.a)>  (filter < r.b = s.b and r.c = 0 >   (join <true>    r    s   )  ))
Optimization is the process of finding the best query execution plan (QEP) with the same semantics (i.e., result) as the SQL query. This can be formalized as the search for the best physical relational expression that is equivalent to the canonic relational expression of the SQL query. The above expression is called “canonic” as it is based on the SQL standard that specifies the semantics of declarative SQL queries by functional means: it gives the combination of relational operations that produce the outcome of the query. The optimizer typically uses logical operators for the canonic relational representation of the query. The logical operators are called “logical” as they refer to pure mathematical concepts: there is no assumption about the algorithm that would implement a logical operator, nor about the representation of a derived table.
The optimizer than builds physical operator plans or plan fragments which are semantically equivalent to the canonic relational expression, and compares the estimated costs of alternative plans or plan fragments. The physical operators are called “physical” as they correspond to the algorithms and data structures implemented by the execution engine of the database system. In general, several physical operators are associated with a given logical operator, as several algorithms implement a given algebraic relational operator. For instance, “join” is a common operation in an RDBMS. A database query may specify a “join” between two (binary join) or more (nary join) tables. Each table itself is composed of rows and columns. A join logical operator is implemented by NestedLoopJoin, MergeJoin, and HashJoin. The optimizer compares alternatives and finally retains the best physical plan that fully implements the query, with the best physical plan being selected based on estimated execution costs. For example, for a given query the optimizer may determine that it is most advantageous to implement a join logical operator using a NestedLoopJoin physical operator. Optimization and execution of queries involving joins have been extensively discussed in the literature. See, e.g., Selinger, Patricia G., et. al., “Access Path Selection in a Relational Database Management System,” ACM SIGMOD Conference, pp. 23-34, 1979 which deals with finding optimal join orders and join methods to use. See, e.g., Shapiro, Leonard D., “Join Processing in Database Systems with Large Main Memories,” TODS 11(3), pp. 239-264, 1986 and Graefe, Goetz, et. al., “Hash Joins and Hash Teams in Microsoft SQL Server,” VLDB, pp. 86-97, 1998 which deal with merge joins and hash joins. See, e.g., Roussopoulos, Nick and Kang, Hyunchul, “Pipeline N-way Join Algorithm Based on the 2-way Semijoin” which deals with use of semi-join based approaches to process multijoin queries efficiently. See, e.g., Dewitt, David J., et. al., “Nested Loops Revisited,” PDIS, pp. 230-242, 1993 which deals with parallelization of joins. The disclosures of the foregoing are hereby incorporated by reference. For further description of nested loop joins, see e.g., commonly-owned U.S. Pat. No. 6,493,701 by Ponnekanti, Nagavamsi titled “Database system with methodogy providing faster N-ary nested loop joins”, the disclosure of which is hereby incorporated by reference.
However, there are issues in implementing certain operators in current optimizers. Consider, for example, the following query:
SELECT * FROM r, s
WHERE r1=s1
The query execution plan for this query may be implemented with a NestedLoopJoin (NLJ) that has table r on the outer side and table s on the inner side. The NLJ would obtain each and every outer tuple in table r. For each outer tuple that qualifies, values in those columns are obtained. Next, for each outer tuple a full scan of the inner side (table s) is performed. If no index is available on the inner side, this may involve a line-by-line scan of each row of the inner table. If an index were available on the inner side, then instead of scanning every line, only the lines that are qualified by the index would be scanned. In either case, for each and every outer tuple, a scan is opened on the inner side to scan all of the tuples that qualify.
During a scan of table s, the outer tuple is stable (i.e., it is not changed or advanced during the scan on the inner side) and the current value of r1 is the same. Accordingly, a “pushdown” can be performed on the inner side by pushing down the value of the current tuple of r on the current inner side of s. This pushdown enables the value of the current tuple of r to be used in qualifying the tuples of s that have s1=r1. In other words, during the scan of s, the join clause that was s1=r1 becomes a search argument (s1=constant). This is the case because for the entire duration of the scan of s, r1 is constant.
One limitation of many current optimizers having a bottom-up search engine is that they can only perform a one-level pushdown. The inner side of a NLJ can include a scan of a table (either table scan or index scan), but implementation of more complex structures, such as a join, on the inner side is problematic in current optimizers, particularly if the optimizer employs a “bottom-up” search strategy.
Many optimizer search engines operate in a bottom-up fashion by incrementally building small plan fragments (e.g., single table scan, two table joins, three table joins, and so forth) starting with the leaves. The plans and permutations are generated bottom-up, the costs are estimated bottom-up, and properties (if any) are also propagated bottom-up. The optimizer thus gradually builds alternative total plans (i.e., plans that have the same semantics as the SQL query and are candidates to become the final query execution plan) and estimates their costs using a cost model. The optimizer typically keeps the partial plans in a cache which is referred to as the “plan cache”.
In order to implement an inner bushy tree structure inside a NLJ, a bottom-up search engine would have to start incrementally building small plan fragments on the inner side. However, a problem in doing so is that when the small plan fragments are built on the inner side, the search engine does not yet know what the outer side will be. Typically, the outer side will be found later when that specific inner side is being put inside the NLJ. However, at the point the plan fragments are built on the inner side, the outer side that would do the pushdown is not yet known. Similarly, the predicates that would become search clauses as a result of the pushdown are also unknown. As the predicates are not yet known, costs also cannot be determined. The optimizer cannot determine a cost yet because it does not really know the behavior. This also means that the optimizer cannot have equivalence class competition as it cannot prune away plan fragments that are not promising (i.e., those with higher costs). As a result of these factors, attempting to do a deep pushdown on the inner side of a NLJ in current bottom-up search engines results in exponential complexity (both in space and in time) in terms of the number of plan fragments that are generated.
For these reasons, bottom-up search engines currently generate only simple leaves and not inner bushes on the inner side of a NestedLoopJoin. In contrast, a search engine can typically generate bushy trees for MergeJoin and HashJoin. To have bushy trees with a NLJ, the search engine typically must use a stored index or a multi-table reformatting. This means that it must store the results of an n-table subjoin in a work table that has an index which is then used inside a NLJ. This approach of building a stored index is an expensive operation as building the stored index and materializing a result set involves logical and physical I/O (input/output). The problem with an approach of generating only simple leaves on the inner side of a NLJ is that some queries (e.g., queries that have outer joins) do need, given the semantics of the query, a bushy tree structure on the inner side. In some cases, the best possible plan uses a NLJ and requires a bushy tree structure on the inner side.
One approach for generating a bushy tree in a NLJ is to pass local information in the plan. This involves telling a join node that given that the outer join did not generate any tuples, it is necessary to generate the null substitution tuple (or “phantom tuple”). This approach of passing local information in the plan is inconsistent with the use of a modern, operator spaced query plan in which each operator is independent of each other operator in the query plan. These operators are designed to be “plug and play” in that any operator can be put over (and under) any other operator in a plan. Accordingly, it would be desirable to use an approach that avoids use of the phantom tuple in a conceptually acceptable manner.
Another issue to be addressed is that when a bushy tree structure is required by the semantics of a query, generating this structure currently results in increased complexity of the search space. In general, bushy trees utilize more memory than left deep trees in the plan cache. In the case of a database installation in which memory is limited, it would be advantageous to generate a bushy tree structure without increasing the space usage of the plan cache.
What is needed is a method for building a bushy tree inside a NLJ in an optimizer that uses a bottom-up search strategy. The solution should build a bushy NLJ tree that correctly captures the semantics of the query while avoiding the expense of building a stored index. Ideally, the solution should also be implemented in a manner which does not require an increase in space utilization during query optimization. The present invention provides a solution for these and other needs.