Database servers commonly receive commands that require the performance of a class of operations known as “join operations.” A join operation is used to combine multiple tables, so data from those tables may be selected in a query. A query that joins two tables specifies how the tables are joined using one or more join criteria. Generally speaking, a join operation comprises merging each row in a first table with each row of a second table that matches the join criteria. For example, a query may specify that rows in the first table only join with rows in a second table when rows in the first table have a value in a particular column that matches the value in a particular column of rows in the second table. The corresponding columns that contain the values that determine which rows of the first table join with which rows of the second table are referred to as the “join keys.”
Database servers may rely on an algorithm known as a “hash join” to efficiently perform join operations. The “hash join” operation typically comprises two phases. In the first phase, known as the “build phase,” the database server generates a hash table by hashing each row of the first table according to a hash function on the join key. In the second phase, known as the “probe phase,” the database server iteratively scans through each row of the second table. For each row in the second table, the database server uses the hash function and the hash table to identify rows in the first table with equivalent join key values. When matching rows are identified, the rows are merged and added to the result set for the join operation.
A hash join operation merges two tables at a time. For a query that specifies a join of more than two tables, a query optimizer may determine an order for joining pairs of tables. Each pair of tables may be merged with their respective join keys, using a hash join, to produce intermediate results. The intermediate results may be merged with other tables or other intermediate results to produce the query results. However, if the tables in the join operation are relatively large, the database server may be unable to fit an intermediate join result in volatile memory. In that situation, a portion or all of the intermediate join result must be written to a storage disk. Reading and writing from a disk takes longer than reading and writing results only from volatile memory.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.