1. Field
Disclosed embodiments generally relate to execution plans in relational databases.
2. Background Discussion
In database query processing, join operations are regularly used to combine fields from tables, using values common to each, to create a new set of data. A join operation is often expensive, particularly for joins of large tables. When processing query operations on a database, a database management system seeks to execute the queries as efficiently as possible. Determining an efficient execution approach, typically referred to as a query execution plan, is handled by a query optimizer.
Some conventional query optimizers attempt to manage the cost of join operations by reducing the size of the working set. For example, some query optimizers use a Bloom filter, which is a space-efficient probabilistic data structure that supports membership queries, i.e., tests of whether an element is a member of a given set. The Bloom filter allows rows of the table that fail the join criteria to be pruned during a table scan phase, thereby generally avoiding excessive input/output (I/O) and temporary space overhead. The Bloom filter can therefore improve performance in join processing and optimization by reducing the number of rows (i.e., the cardinality) of the scan result before execution of the join operation. Because the savings from the reduced scan result cardinality can outweigh the extra processing cost of the filter, the query optimizer may integrate the Bloom filter into the query execution in order to achieve overall efficiencies.
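By way of illustration only, the pruning effect described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation of any particular database system: the class `BloomFilter`, the function `bloom_filtered_join`, the filter size, the number of hash functions, and the sample column names are all hypothetical. The filter is built from the join keys of one table and applied while scanning the other, so that rows that cannot match are dropped before the join. Because a Bloom filter may report false positives but never false negatives, the join itself still verifies each surviving row.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes are arbitrary assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_filtered_join(build_rows, probe_rows, key):
    """Prune probe_rows with a Bloom filter, then hash-join the survivors."""
    bloom = BloomFilter()
    index = {}
    for row in build_rows:
        bloom.add(row[key])
        index.setdefault(row[key], []).append(row)

    # Scan phase: drop rows whose key is definitely not in the build side.
    scanned = [row for row in probe_rows if bloom.might_contain(row[key])]

    # Join phase: exact match on the reduced scan result.
    joined = [{**b, **p} for p in scanned for b in index.get(p[key], [])]
    return scanned, joined


# Hypothetical sample data: a small table joined against a larger one.
customers = [{"id": i, "name": f"c{i}"} for i in (1, 2, 3)]
orders = [{"id": i, "amount": i * 10} for i in range(1, 101)]
scanned, joined = bloom_filtered_join(customers, orders, "id")
print(len(orders), len(scanned), len(joined))
```

In this sketch the scan result passed to the join is expected to be much smaller than the 100 input rows, while the join result is unchanged, which is the cardinality reduction the optimizer weighs against the cost of building and probing the filter.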
Conventional systems typically use a bottom-up approach to estimate the cost of applying the Bloom filter to a multiple join plan. In this approach, the effect of the Bloom filter is first estimated at the scan operator level based on partial statistics of the join, the estimate at the scan level is then used to estimate the cost of the join at the level above the scan level, and so forth. However, this bottom-up approach can be highly inaccurate. The problem is exacerbated because a small estimation error at the scan level may accumulate and be amplified through multiple join levels. As a result, conventional systems generally fail to provide an accurate cost estimation model for the Bloom filter.
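The amplification of a scan-level error can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each level of a bottom-up estimate carries the same relative error and that the errors compound multiplicatively; real estimators need not behave this way, and the function name `amplified_error` is hypothetical.

```python
def amplified_error(per_level_error, levels):
    """Relative error after compounding the same error over `levels` levels.

    Assumes a simplified multiplicative model: each level scales the
    estimate it receives by (1 + per_level_error).
    """
    return (1 + per_level_error) ** levels - 1


# A 10% error at each of four levels (one scan plus three joins).
for levels in range(1, 5):
    print(levels, round(amplified_error(0.10, levels), 4))
```

Under these assumptions, a 10% error per level grows to roughly 46% after four levels, since 1.1 raised to the fourth power is 1.4641.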
Accordingly, conventional systems do not process queries involving join operations with ideal performance and accuracy, especially when the operations involve multi-level joins on large tables.
The accompanying drawings, which are incorporated herein and form part of the specification, illustrate the embodiments of the present invention and, together with the description, further serve to explain the principles of embodiments and to enable a person skilled in the relevant art(s) to make and use such embodiments.