Processing queries in a database system typically involves at least two phases: a compilation phase and an execution phase. During the compilation phase, the database system takes the query text that specifies the logical outcome of the query and generates an execution plan that can be executed based on the physical layout of the data.
A query optimizer may generate multiple valid query execution plans, each of which would produce a valid query answer if executed against the database. However, a query optimizer may select only one of the query execution plans for execution. The selection of a query execution plan by the query optimizer may be based on an estimated cost of executing the execution plan relative to the other candidate execution plans. A query optimizer may take into account several factors to generate the estimated cost, such as the number of rows that may be processed during each step of the execution, the operations to perform (e.g., joins, table scans), and the cost of accessing the data according to the specified execution plan.
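By way of illustration only, the cost-based selection described above might be sketched as follows. The `Step` and `Plan` types, the per-row cost figures, and the additive cost formula are hypothetical simplifications introduced for this example; an actual query optimizer may use a considerably richer cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    operation: str        # e.g. "table_scan", "hash_join"
    estimated_rows: int   # rows the optimizer expects this step to process
    cost_per_row: float   # hypothetical per-row cost (processing plus data access)


@dataclass(frozen=True)
class Plan:
    name: str
    steps: tuple          # the ordered steps of the execution plan


def estimated_cost(plan):
    """Sum the estimated cost of every step (assumed additive model)."""
    return sum(step.estimated_rows * step.cost_per_row for step in plan.steps)


def choose_plan(candidates):
    """Select the single candidate plan with the lowest estimated cost."""
    return min(candidates, key=estimated_cost)


# Two valid plans for the same join; both would produce the same answer.
nested_loop = Plan("nested_loop", (
    Step("table_scan", 1_000, 1.0),
    Step("nested_loop_join", 1_000_000, 0.5),
))
hash_join = Plan("hash_join", (
    Step("table_scan", 1_000, 1.0),
    Step("table_scan", 1_000, 1.0),
    Step("hash_join", 2_000, 2.0),
))

best = choose_plan([nested_loop, hash_join])
```

In this sketch both plans are valid, but only the plan with the lower estimated cost is selected for execution.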
A partition-wise join is an effective technique for improving the performance of certain join queries on a multi-node cluster, namely queries in which two tables are joined on a partition key of at least one of the two tables. Examples of partition-wise joins may be found in U.S. Pat. No. 6,609,131, “PARALLEL PARTITION-WISE JOINS,” filed Sep. 27, 1999 by Mohamed Zait et al., the entire contents of which are hereby incorporated by reference as if fully set forth herein. The partition-wise join technique improves query execution performance by assigning each pair of corresponding partitions of the joined tables to a respective cluster process and having that cluster process scan the partitions and perform the join. The partition-wise join technique can also be applied when the partitioned tables are populated in memory in a dual-format database system. Such systems are described, for example, in U.S. patent application Ser. No. 14/377,179, entitled “Mirroring, In Memory, Data From Disk To Improve Query Performance”, filed Jul. 21, 2014, the contents of which are incorporated herein in their entirety.
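As a minimal sketch of the idea, and not the implementation of the referenced patent, a partition-wise join can be modeled as follows. Here both tables are assumed to be partitioned on the integer join key by a simple modulo scheme, and a thread pool stands in for the cluster processes; the function names, the partition count, and the sample tables are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4  # hypothetical partition count shared by both tables


def partition(rows, num_partitions=NUM_PARTITIONS):
    """Split rows into partitions by the integer join key in column 0."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[row[0] % num_partitions].append(row)
    return parts


def join_partition_pair(pair):
    """Join one pair of corresponding partitions with a local hash join."""
    left_part, right_part = pair
    lookup = {}
    for row in left_part:
        lookup.setdefault(row[0], []).append(row)
    # Emit the left row followed by the non-key columns of the right row.
    return [l + r[1:] for r in right_part for l in lookup.get(r[0], [])]


def partition_wise_join(left_rows, right_rows):
    """Assign each pair of corresponding partitions to its own worker."""
    pairs = list(zip(partition(left_rows), partition(right_rows)))
    with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
        results = list(pool.map(join_partition_pair, pairs))
    return [row for part in results for row in part]


customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
orders = [(1, "book"), (1, "pen"), (3, "lamp"), (4, "desk")]
joined = sorted(partition_wise_join(customers, orders))
```

Because rows with equal keys always fall into partitions with the same index, each worker only needs to scan its own pair of partitions, and no rows have to be exchanged between workers.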
However, even when a join query qualifies for an in-memory partition-wise join, performing the join in that manner may not be cost-effective if the cluster processes have to read many rows from disk or from the buffer cache. Unfortunately, there is no technique for determining when to apply a partition-wise join to a qualifying join query in the situation where the joined tables are populated, at least partially, in memory.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.