The invention is in the field of data base management and particularly concerns a hybrid technique for joining at least two tables of a relational data base management system. The technique is hybrid in that it combines positive features of two established methods for joining tables.
In a relational data base system, data is stored in tabularized format. The tables of the data base consist of sets of rows which share common characteristics. For example, the Skill.sub.-- Table structure of FIG. 1 is a table 10 in which are stored the records of all employees with particular production skills. Although the table implies that the records are in sequential order in contiguous storage, in fact, they may be stored randomly. However, for clarity of the discussion in this application and to emphasize the operation of the invention described below, tabularization will be illustrated as in FIG. 1.
In FIG. 1, each horizontal slice ("row") of the table contains three data fields, empno, emp.sub.-- name, and skill. Thus, the first row in the table lists an employee whose employee number (empno) is 53, whose surname (emp.sub.-- name) is Fisher, and whose production skill is listed as "design". Conventionally, the table is formatted by sequentially numbering the assembled rows so that, from top to bottom, the rows are identified by numbers ("identifiers"). In reality, the row identifier (RID) corresponds to an offset on a "page" of storage in which the table is contained.
"Join" is one of the most powerful operators in a relational data base system. The tool enables the user to assemble new tables out of existing tables according to criteria presented to the data base system in the form of a query. For example, in FIG. 2, it is asserted that the relational data base system which contains the Table 10 of FIG. 1 also contains a Table 12 which relates the employees listed in Table 10 with their direct managers and their departments ("dept"). A user of the system presents a query asking for all employees who possess the "test" skill and whose manager is DAVIS. The resulting table is indicated by reference numeral 14.
When the tables are "joined", the rows of the resulting table 14 will contain the employee's name, manager, and skill, as illustrated in FIG. 2.
In deciding which records of the Tables 10 and 12 satisfy the predicates (i.e., conditions on values of columns), the contents of the tables are subjected to a structured examination. In this regard, the empno fields of the records are examined in a manner which treats the vertical alignment of these fields as a column. In each of the Tables 10 and 12, this column is referred to as a "join column". Thus, during the search, when the row with employee number 53 of the Table 12 is encountered and it satisfies the manager predicate, reference will be made to the empno column of Table 10 to discover the row or rows in that table for employee number 53. Once rows with identical join column values are located in Table 10, the skill condition is applied to the row in Table 10. Conditions like the latter one are referred to as "local" predicates, since they refer to only columns of a single table.
When two rows in Tables 10 and 12 are found which satisfy, first, the join predicate, and second, the respective local predicates, a new row including the desired join table fields is created from these two records and entered into the join Table 14.
Many techniques are known in the prior art for implementing, in one way or another, a join method which will combine rows of two or more tables into a join table. The assignee of this patent application, for example, provides the following data base products, each of which supports one or more join methods: DB2, SQL/DS, AS/400 Data Base Manager, and OS/2 Extended Edition Data Base Manager.
A representative syntax used for one or more of these products in stating the query given above is expressed as follows in SQL:
SELECT emp.sub.-- name, manager, skill
FROM dept.sub.-- table, skill.sub.-- table PA2 WHERE dept.sub.-- table.empno=skill.sub.-- table.empno
and dept.sub.-- table.manager="Davis" PA3 and skill.sub.-- table.skill="test"
This query joins the Tables 10 and 12 for all rows whose empno values are equal. The condition on the empno field is called a "join predicate". The columns of the Tables 10 and 12 referred to in the join predicates are the "join" columns of the tables. The predicates manager="Davis" and skill="test" are called "local predicates".
This example is not meant to be limiting. In other words, the prior art data base systems support queries which have more than one join predicate and more than one local predicate for each table. Also, plural predicates can be combined using conventional AND operators and OR operators, or a combination of operators. Furthermore, any number of fields can be retrieved for rows which satisfy join predicates to construct the resulting rows. Join may involve more than two tables.
Two specific prior art join procedures are known; each has particular strengths which make it particularly efficient for certain conditions, and each has limitations which make it expensive for other conditions. The two techniques are termed "nested loop join" and "sort/merge join".
Refer to FIG. 2 for an understanding of the prior art nested loop join technique. In FIG. 2 (and in the following Figures), the dept.sub.-- table structure 12 is termed the "outer" table, while the skill.sub.-- table structure 10 is the "inner" table. In the nested loop join technique, for each row in the outer table 12 that satisfies the local predicate or predicates, the join column or columns in the inner table are scanned for matching rows. Thus, the join column of the outer table is scanned from top to bottom to locate rows satisfying the predicate that the employee's manager be Davis. For each row in the outer table in which Davis is denoted as the manager, the join column of the inner table 10 is scanned from top to bottom if an index does not exist. During the scan, each inner table row whose join column value matches that of the outer table's row is subjected to the test of whether its skill condition is equal to test. In the nested loop scan, therefore, two nested scans are employed, the first, a scan of the outer table for rows which satisfy the local condition; then, for each of these rows, a complete join column scan of the inner Table 10 or a probe via an index on the join column is made to find matching records.
The sort/merge join of the prior art is illustrated in FIG. 3. A pre-condition of this technique is that the outer and inner tables be ordered based upon their join columns, which enables the join to be done similarly to the MERGE phase of a SORT routine. Relatedly, a table can be ordered by sorting, or reached through an index to provide access in join column sequence. Local predicates on each table are applied before a sort, as illustrated in FIG. 3. Thus, the Table 20 is built from the Table 12 by sorting the manager 32 Davis rows in order by employee number magnitude. Similarly, a sorted Table 22 is derived from the skill Table 10 by selecting skill table records having "test" in their skill fields and ordering those records by employee number. Once join column order is imposed on the tables, the join is done very efficiently using the join column sorted structures. Thus, the join begins by scanning the join column of the Table 22 with the employee number 53, corresponding to the first entry in the derived Table 20. When a match is encountered, a join table row is assembled from the matching rows and entered into the joined Table 14. Following this, the next record in the Table 20 is obtained and the scan of the join column of the Table 22 is picked up from where the last scan stopped. In this manner, the join columns of the sorted Tables 20 and 22 are each scanned only once in effecting the join of the tables. In comparison, the nested loop join procedure requires that the inner table join column be scanned entirely each time an outer table row is found satisfying the outer table local predicate or predicates.
The nested loop join technique makes efficient use of an index on the join column of the inner table. The nested loop join technique is good when the join column values passed to the inner table are in sequence and the join column index of the inner table is clustered, and when the number of rows retrieved in the inner table by finding matching values in the index is small. As is known, an index is clustered when the rows of the table are, for the most part, stored in the same physical sequence as the sequence of key values.
The drawbacks of the nested loop join are manifold. First, for each row in the outer table, the join column value of the outer table is passed as the key to the index of the inner table. When a match is found to the index key, the join procedure is suspended and an I/O procedure is dispatched to fetch the data page corresponding to the index key if the data page is not already in the memory of the system. Suspension of the join process to dispatch the I/O process makes the two "synchronous". Synchronous I/O is undertaken even for a clustered index access on the inner table, because the join column values of the outer table may not be in sequence, or there may be a large gap between outer table join column values. As is known, a synchronous I/O process takes longer than an asynchronous I/O process which accumulates page numbers and fetches them in bulk or which prefetches pages.
The second drawback of the nested loop join arises when the inner table is accessed through an index tree. In this case, the join column value of each row of the inner table is searched through the index tree starting from the root, and continuing to the leaf page for the matching key value. This mode of access is referred to as a random "index probe". When the join column values of the outer table are far apart, the index probe is efficient. When the join column values are spaced closely, index scan through leaf pages is more efficient. However, in this case, the nested loop join does not make use of the information gained from the previous row processing.
Last, in the nested loop join, rows with duplicate join column values in the outer table may be accessed randomly, as when access is keyed on the field containing the local predicate of the outer table. Therefore, duplicate rows in the outer table require the same amount of CPU processing time as that for the first of the duplicate rows and may require fetching the data page of the inner table again, if it is not in memory.
The disadvantage of the sort/merge join lies in its failure to make effective use of the join column index for filtering. In this technique, rows that do not participate in the join result may also be sorted resulting in spending unnecessary CPU and I/O time in accessing rows that will be filtered out by the join predicates later on.
The inventors have determined that the performance differential between nested loop and sort/merge joins can vary by an order of magnitude or more. The implication is that the cost of incorrectly choosing a join method is very high. Yet, today's data base systems frequently choose the join method which may be significantly more expensive at execution time, due to insufficient or incorrect statistics or correlated column values across tables.
There is a need for an alternative to the two primary prior art join methods which will circumvent their disadvantages, while at the same time maintaining advantages of both methods.