Analytic query processing in data warehouses typically involves large data sets that are updated infrequently and in a batch-oriented manner. In many cases, it is desirable to execute queries on larger amounts of data (typically spanning a longer time period), but the performance of the query processing system limits the data set size.
Performance can be improved by processing the queries in memory and by increasing the number of servers operating on the queries. However, adding servers increases power and power-related infrastructure costs, which places a bound on the number of servers and thus on the size of the data sets.
Alternatively, performance can be improved at lower power by performing in-memory database queries in a cluster of low-power processing units. Each processing unit has low compute power, but a cluster with thousands of processing units has very high aggregate performance. While in-memory database query processing in the cluster increases performance, one kind of query operation still poses a challenge: large table joins do not scale in performance with the size of the cluster. Therefore, it is desirable to seek further improvements in the performance of in-memory processing of large table joins.