Database servers commonly receive commands that require the performance of a class of operations known as “join operations.” A join operation is used to combine multiple tables, so that data from those tables may be selected in a query. A query that joins two tables may specify how the tables are joined using join criteria. A join operation comprises merging each row in a first table with each row of a second table that matches the join criteria. For example, a query may specify that a row in the first table joins with a row in the second table only when the row in the first table has a value in a particular column that equals the value in a particular column of the row in the second table. The corresponding columns that contain the values that determine which rows of the first table join with which rows of the second table are referred to as the “join keys.”
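The equality-join semantics described above can be sketched as follows. This is a minimal illustration, not any particular database server's implementation; the table contents, column names, and the naive nested-loop strategy are invented for the example.

```python
def nested_loop_join(first_table, second_table, key):
    """Merge each row of first_table with every row of second_table
    whose value in the join-key column is equal (hypothetical sketch)."""
    result = []
    for left in first_table:
        for right in second_table:
            if left[key] == right[key]:
                # Merge the two matching rows into one combined row;
                # columns from the second table are prefixed to avoid clashes.
                merged = {**left, **{"r_" + k: v for k, v in right.items()}}
                result.append(merged)
    return result

# Example tables (invented): join on the "dept" column as the join key.
employees = [{"id": 1, "dept": 10}, {"id": 2, "dept": 20}]
departments = [{"dept": 10, "name": "Sales"}]
rows = nested_loop_join(employees, departments, "dept")
```

Here only the employee in department 10 finds a matching department row, so the result set contains a single merged row.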
Database servers may rely on an algorithm known as a “hash join” to efficiently perform join operations. The “hash join” operation typically comprises two phases. In the first phase, known as the “build phase,” the database server generates a hash table by hashing each row of the first table, referred to herein as the build table, according to a hash function on the join key. In the second phase, known as the “probe phase,” the database server iteratively scans through each row of the second table, referred to herein as the probe table. For each row in the probe table, the database server uses the hash function and the hash table to identify rows in the build table with equal join key values. When matching rows with equal join key values are identified, the rows are merged and added to the result set for the join operation.
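The two phases of the hash join described above can be sketched as follows. This is an illustrative outline, assuming Python's built-in `hash()` stands in for the hash function on the join key; an actual database server's hash table layout and hash function would differ.

```python
from collections import defaultdict

def hash_join(build_table, probe_table, key):
    # Build phase: hash each row of the build table on its join key
    # and insert it into the hash table.
    hash_table = defaultdict(list)
    for build_row in build_table:
        hash_table[hash(build_row[key])].append(build_row)

    # Probe phase: scan each row of the probe table and use the same
    # hash function to look up candidate build rows.
    result = []
    for probe_row in probe_table:
        for build_row in hash_table.get(hash(probe_row[key]), []):
            # Confirm equality of the join key values, since distinct
            # values may hash to the same bucket.
            if build_row[key] == probe_row[key]:
                result.append({**build_row, **probe_row})
    return result

# Invented sample tables joined on column "k".
build = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}]
probe = [{"k": 1, "b": "p"}, {"k": 3, "b": "q"}]
rows = hash_join(build, probe, "k")
```

Note that the hash table is built once over the (typically smaller) build table, and each probe row then requires only a constant-time lookup rather than a scan of the entire build table.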
Hash joins, where large hash tables are generated, may suffer from high latency when accessing the hash table due to cache misses. For example, hash joins that generate large hash tables may result in at least part of the hash table being stored in DRAM-based main memory instead of being cached in SRAM-based CPU cache. DRAM refers to dynamic random-access memory and SRAM refers to static random-access memory. As a result, hash lookups for data not stored within the cache result in cache misses. Additionally, modern hardware architectures with multiple processing cores, sophisticated caching architectures, and custom instructions may become inefficient due to cache misses and branch mispredictions. Branching in code may be defined as instructions the execution of which is dependent on data values. One example of code branching is if/else statements where, depending on data values, the CPU may execute one block of code instead of another. CPU architectures have been developed to speculate on branching and/or data values, based upon traces of executions, in order to predict which branch should be executed. For example, the CPU may expect that 90% of the time code block A is executed rather than code block B. In this case the CPU may speculatively execute code block A based upon the expectation that 90% of the time the data values will indicate that code block A should be executed. After executing code block A, the CPU may evaluate the data values to determine whether the branch prediction was correct. If, however, executing code block A was the incorrect prediction, then the CPU will have to discard the work performed in code block A. Branch mispredictions, such as the one described, may thus result in the CPU flushing several CPU cycles of incorrectly predicted executions.
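The data-dependent branching described above can be illustrated by contrasting a branchy loop with a branch-free equivalent. This is a structural sketch only: the function names and data are invented, and Python itself does not expose CPU branch prediction; the point is that the first form contains an if/else the CPU must predict, while the second replaces the branch with arithmetic that is always executed.

```python
def count_matches_branchy(values, threshold):
    # Data-dependent branch: whether the if-body runs depends on each
    # value, so the CPU must predict the branch on every iteration.
    count = 0
    for v in values:
        if v >= threshold:
            count += 1
    return count

def count_matches_branchless(values, threshold):
    # Branch-free form: the comparison is converted into a 0/1 value
    # that is always added, leaving no data-dependent branch to predict.
    count = 0
    for v in values:
        count += (v >= threshold)  # bool arithmetic: True == 1, False == 0
    return count

data = [3, 7, 1, 9]
branchy = count_matches_branchy(data, 5)
branchless = count_matches_branchless(data, 5)
```

Both forms compute the same result; architecture-conscious implementations favor the second style in hot loops where the branch outcome is hard to predict.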
Branch mispredictions on modern hardware architectures may result in flushing entire queues of data in order to correct branch prediction errors, which may result in stalls in execution of code. Therefore, there is a need for efficient hash join algorithms that address the demands of large hash tables and the complexity of modern hardware using an architecture-conscious design that reduces instances of cache misses and branch mispredictions.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.