Various embodiments of this disclosure relate to database systems and, more particularly, to high-performance hash joins using memory with internal parallelism.
A join is a common and important relational database operation. Given two database tables and a selected column that appears in both, for each row in one table, a join seeks to extract the value in the selected column and to identify one or more rows in the second table that have the same value in this selected column. The result of the join is a table having rows that are a combination of those from the first and second table, each row having values joined across the tables due to a like value in the selected column. Not all columns from the corresponding rows need be included in the resulting table, depending on the details of the join instruction initiating the procedure.
The selected column on which the join is performed is referred to as the join key. The first table is referred to herein as the fact table, and the second table is referred to herein as the dimension table.
A common approach to implementing joins is through a technique known as a hash join. A hash join procedure generally includes two phases: (a) build the hash table (i.e., the build phase); and then (b) probe the hash table (i.e., the probe phase). In the build phase, a hash table is constructed from hashed join-key values from the dimension table. Within a row of the hash table corresponding to a row of the dimension table, a pointer to the corresponding row or desired data from the corresponding row may be included in the hash-table entry and used to assist in the join. In the probe phase, values of the join key are read from the various rows of the fact table and then hashed, and the corresponding hash-table entry is retrieved from memory and examined to see if it contains the join-key value from the fact table. If there is a hit in the hash table, a join is performed with selected data from the appropriate rows in the fact table and the dimension table.
Efficient implementation of join algorithms is critical for the performance of many database queries. For hash joins, as for other join algorithms, overall performance is often affected by the potentially large amount of data in the two tables. If the two tables are stored in a columnar format, storage of join-key values for successive rows is generally adjacent in memory. Thus, reading entries from the dimension table during the build phase and from the fact table during the probe phase can operate at speeds approaching peak memory bandwidth limits, given mechanisms such as hardware prefetch, which enable efficient sequential accesses to external dynamic random-access memory (DRAM).
Accesses into the hash table during the probe phase, however, are data-dependent and random, due to the use of hash functions. Double-data-rate (DDR) DRAM cannot efficiently support large numbers of data-dependent random read requests, because of the limited number of internal responders, or banks, that can extract data from the DRAM cells in parallel. Moreover, because the read requests are data-dependent and random and the hash table can be large, even caches as large as 100 MB provide little or no benefit unless extensive additional work is done to partition the hash table. Thus, the data-dependent random accesses to the hash table dominate in the performance of hash joins.
Recently, DRAM has become available in a form that provides a dramatic increase, by one to two orders of magnitude, in the number of responders that can operate in parallel inside the DRAM. One example is the so-called “hybrid memory cube” (HMC), described in a specification from the Hybrid Memory Cube Consortium. HMCs are expected to be available with 256 independent responders in a 4 GB DRAM structure. The present disclosure refers to DRAM configurations that incorporate relatively large numbers of independent responders as “DRAM with extensive internal parallelism,” or “DRAM with EIP,” and the term “DRAM” herein can refer to various types of DRAM, including, for example, DRAM with EIP. For example, and not by way of limitation, DRAM with EIP may have the following characteristics: (1) a large number of responders, such as two hundred or more but fewer may also be acceptable, that can return data in parallel, with differential latencies that are small compared with the bulk latency for a single isolated read request; and (2) high aggregate interface bandwidth (e.g., for the Micron HMC, this can exceed 200 GB/s aggregated over several physical links).