When evaluating analytic queries in a distributed multi-node system, the need to re-distribute or re-partition the data often arises. For example, in the context of a database system, analytic queries that require join and aggregation operations on different keys will benefit from a re-partitioning of the data to optimally process each operation on the different keys. Additionally, if a data distribution for a particular operator becomes heavily skewed towards certain nodes, then performance can be improved by re-partitioning the data to re-balance the workload across the multi-node system.
While re-partitioning allows the data to be processed more efficiently in the distributed multi-node system, the re-partitioning itself adds processing overhead, since significant amounts of data need to be exchanged between nodes in a many-to-many fashion. The complexity of the re-partitioning also increases as the number of nodes increases. A non-blocking, high-bandwidth interconnect such as InfiniBand can be used to accelerate the re-partitioning. However, even with the use of an appropriate high-speed interconnect, the re-partitioning may still account for 50-60% of the overall query execution time.
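To make the many-to-many exchange concrete, the following is a minimal sketch, not an implementation of any particular system. It assumes a hypothetical four-node cluster, rows of the form (join_key, group_key), and simple modulo hashing as the partitioning function; the names NUM_NODES, node_for, and repartition are illustrative only. Data initially partitioned on the join key is re-partitioned on the aggregation key, and the number of rows sent between each pair of nodes is counted.

```python
from collections import Counter

NUM_NODES = 4  # assumed cluster size, chosen only for illustration


def node_for(key, num_nodes=NUM_NODES):
    """Map an integer key to a node using modulo hashing (illustrative choice)."""
    return key % num_nodes


def repartition(partitions, key_index, num_nodes=NUM_NODES):
    """Re-partition rows on the column at key_index.

    Returns the new per-node partitions and a Counter mapping
    (source_node, destination_node) to the number of rows sent over the network.
    """
    new_partitions = [[] for _ in range(num_nodes)]
    traffic = Counter()
    for src, rows in enumerate(partitions):
        for row in rows:
            dst = node_for(row[key_index], num_nodes)
            new_partitions[dst].append(row)
            if dst != src:  # rows that stay on the same node cost no network transfer
                traffic[(src, dst)] += 1
    return new_partitions, traffic


# Hypothetical rows: (join_key, group_key)
rows = [(i, i // 4) for i in range(16)]

# Initial layout: partitioned on the join key (column 0), as a join would want.
initial = [[] for _ in range(NUM_NODES)]
for row in rows:
    initial[node_for(row[0])].append(row)

# Re-partition on the aggregation key (column 1).
new_parts, traffic = repartition(initial, key_index=1)

print(len(traffic), sum(traffic.values()))  # prints "12 12"
```

In this sketch every one of the 4 × 3 ordered pairs of distinct nodes exchanges data, and 12 of the 16 rows cross the network. The number of communicating pairs grows as N × (N − 1) with the node count N, which is one way to see why the cost of the exchange rises as nodes are added.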
Based on the foregoing, there is a need for a method to optimize data re-partitioning in a distributed multi-node system.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.