Present invention embodiments relate to partitioning and de-partitioning hash tables or other data structures for a database hash join operation, and more specifically, to adjusting a quantity of partitions for hash join operations based on an amount of available memory during runtime.
Typically, the number of hash partitions for a hash join operation is increased when memory is constrained. The number of hash partitions is chosen such that each hash partition fits in available memory. Repartitioning may be implemented in a number of ways. For example, each hash partition whose data is too large to fit into available memory may be divided into a number of subpartitions. Outer table rows are loaded into the available memory and joined with each subpartition hash table. When using this approach, the same subpartition hash table may be loaded into memory many times, or the outer table rows may be scanned multiple times.
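By way of illustration only, the following minimal sketch shows one way a partition count might be chosen so that each hash partition fits in available memory, and how rows might then be distributed among the partitions. The function names `choose_partition_count` and `partition_rows`, the byte-size parameters, and the assumption that rows spread evenly across partitions are hypothetical and are not taken from any particular database system.

```python
import math


def choose_partition_count(inner_bytes, available_bytes):
    """Return the smallest partition count for which an evenly
    distributed partition would fit in available memory.

    Assumption: rows hash evenly, so each partition holds roughly
    inner_bytes / count bytes. Skewed data can violate this, which
    is when subpartitioning becomes necessary.
    """
    if available_bytes <= 0:
        raise ValueError("available_bytes must be positive")
    return max(1, math.ceil(inner_bytes / available_bytes))


def partition_rows(rows, key, count):
    """Distribute rows (dicts) into `count` partitions by hashing
    the join column named `key`."""
    partitions = [[] for _ in range(count)]
    for row in rows:
        partitions[hash(row[key]) % count].append(row)
    return partitions


# Example: a 1000-byte inner table with 300 bytes of available memory.
count = choose_partition_count(1000, 300)
inner = [{"id": n} for n in range(10)]
parts = partition_rows(inner, "id", count)
```

Because rows with equal join keys always hash to the same partition, each inner partition needs to be joined only with the matching outer partition. A partition that still exceeds available memory would be divided again, which leads to the repeated loading or scanning described above.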
An alternative approach involves switching inner and outer leg roles while performing a join against a large hash partition that cannot fit into available memory. Thus, the original inner leg is treated as an outer leg, and the original outer leg is treated as an inner leg. In this approach, a subset of rows of the original outer leg is hashed into hash tables, and tuples of the original inner leg are joined with the hash tables.
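The role-switching approach may be sketched, purely as an example, as follows. The function name `role_reversed_join`, the `chunk_size` parameter, and the representation of rows as dictionaries are assumptions made for illustration; an actual implementation would size each subset by available memory rather than by a fixed row count.

```python
from collections import defaultdict


def role_reversed_join(inner_rows, outer_rows, key, chunk_size):
    """Join two legs on column `key` with their roles switched.

    A subset (chunk) of the original outer leg is hashed into a
    hash table, and every tuple of the original inner leg is then
    probed against that table. The inner leg is rescanned once per
    chunk of the outer leg.
    """
    results = []
    for start in range(0, len(outer_rows), chunk_size):
        # Build a hash table on one subset of the original outer leg.
        table = defaultdict(list)
        for outer in outer_rows[start:start + chunk_size]:
            table[outer[key]].append(outer)
        # Probe the table with tuples of the original inner leg.
        for inner in inner_rows:
            for outer in table.get(inner[key], ()):
                results.append((inner, outer))
    return results


inner_leg = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}]
outer_leg = [{"k": 1, "b": 10}, {"k": 1, "b": 11}, {"k": 3, "b": 12}]
joined = role_reversed_join(inner_leg, outer_leg, "k", chunk_size=2)
```

In this sketch the number of passes over the original inner leg grows with the number of subsets of the original outer leg, so a larger outer leg directly increases the work performed.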
The above-mentioned approaches can be very complex and expensive because the outer leg is typically much larger than the inner leg in a hash join operation.