1. Field of the Invention
The present invention relates generally to query optimization in database management systems. More specifically, the invention relates to the predictable query execution through early materialization.
2. Description of the Related Art
Traditionally, modern Database Management Systems (DBMSs) translate declarative Structured Query Language (SQL) statements into an executable plan prior to the actual execution of the query, hence strictly separating the plan selection and execution phases. A QEP is a functional program that is interpreted by the execution engine to produce the query result. A QEP outlines how the execution engine actually will run a specific query, i.e., how the data will be processed. For example, an important decision might be whether to use indexes and, if there are more indexes, which of these will be used. To achieve optimal performance, the translation phase employs an optimizer, which searches the space of all possible query execution plans (QEPs) for the optimal plan with respect to expected query execution cost. The optimizer, which is part of a compiler, needs to decide in which order to join the existing tables. To choose the optimal QEP, the compiler first compiles all the plans and then the optimizer needs to estimate the cost of each plan. The optimizer determines that a QEP is optimal based on a cost model. A cost model is a mathematical model that determines the execution cost of a query execution plan. Examples of execution costs of a query execution plan are commonly determined by I/O costs, CPU costs, and communication costs. This cost estimation procedure requires the accurate estimation of the intermediate result sizes, referred to as cardinalities, of each processing step. The estimation error in these cardinalities usually increases exponentially in the QEP, as cardinality estimates at one step are computed by multiplication of cardinality estimates at the earlier steps. Intermediate results are the results of a partial execution of a query execution plan. Intermediate results are communicated between the current query execution of the query execution plan and the next query re-optimization of the query execution plan. A partially executed query execution plan is a query execution plan that is executed up to a checkpoint within the query execution plan. A partially executed federated query execution plan is a federated query execution plan that is executed up to a checkpoint within the federated query execution plan.
For complex queries with a high number of tables and predicates, the cardinality estimation may be so far off from the truth, that the optimizer picks a highly sub-optimal, or inefficient, QEP, resulting in unnecessarily long query execution times. Even for only moderately complex queries, this situation occurs frequently when either parameter markers are used or the optimizer's estimation process makes assumptions about the underlying data that do not hold true. Prominent examples are the independence and uniformity assumptions that the optimizer employs, as long as no statistics are available that indicate the contrary. Statistics are numerical data that represent a measurement, or analysis, of the cost metric employed by the optimizer. Statistics represent information about, but are not limited to, table cardinalities, index cardinalities, column cardinalities, operator costs, CPU time, and IO time. For example, in the case of cardinalities, statistics tell the optimizer that one hundred rows of a table were expected to be used by the query execution plan at a certain point in the query execution plan.
Cost-based optimization of federated queries transparently extends optimization across data sources by introducing communication cost. Otherwise, cost-based optimization treats remote tables similar to local tables by introducing a source or server property that describes where the processing of the current plan operator happens. A special operator, called a SHIP operator, describes the point in the QEP where intermediate results are communicated between a remote data source and the federated DBMS. The statistics that are used to estimate cardinalities for remote tables are in most cases obtained from a remote data source, since the gathering of statistics on remote data is very expensive for the federated DBMS. Cardinality is a measure to count a number of rows. A cardinality estimate is an estimation of a number of rows. For example, the cardinality estimate of a CHECK point is the estimation of the number of rows at the CHECK point. The cardinality of an intermediate result is the number of rows in an intermediate result. The variety of relational DBMSs forming the remote data sources employ different optimizers and utilize different forms of statistics. Out of these, the federated DBMS can often exploit only the very basic statistics about the number of rows in a table. Exploiting statistics can be done in a variety of ways; for example, by querying an existing catalog at a remote data source, or by sampling data at a remote data source. The federated DBMS's optimizer is therefore not able to model data distribution and correlation in detail, as this would require distribution and multivariate statistics. The worst cases are federated queries that access non-relational remote data sources or remote DBMSs that do not employ a cost based optimizer. In those cases, there are no statistics on the remote data available at all, and the optimizer is forced to derive its cardinality estimates from default values.
Federated queries, therefore, quite frequently execute using a suboptimal QEP. For local parts of a federated query, the overall model of the data is, in most cases, fairly accurate, and the cardinality misestimates are caused by isolated predicates. For the federated part, however, misestimates can occur at virtually every point in the QEP in some cases. In comparison to purely local queries, the performance degradation, by means of absolute execution time through a suboptimal QEP, is higher for federated queries, because the remote data cannot be accessed natively, but only through a declarative, relational interface, which adds its own overhead.
The inadequacy of traditional query optimization for federated data systems has been recognized for a long time. One current approach provides for sites that are autonomous and can move data fragments independently; hence, no site has good global knowledge of data sizes or layouts. This approach adopts a hierarchical query optimization approach where sites sub-contract out query fragments to other sites via a bidding process that is done just before query execution begins. Other Federated DBMSs also provide calibration functions to help wrappers update their statistics periodically. However, all of these approaches are compile-time or just-before-execution approaches, and are therefore vulnerable to bad cardinality estimates.
More run-time solutions to adaptive query processing have been proposed for traditional, non-federated, DBMSs. Among the earliest was the work on choose-plan operators which pick one among multiple pre-chosen plans during query execution based on the value of run-time parameters. The disadvantage of this approach is that pre-choosing all possibly optimal plans leads to combinatorial explosion, whereas in methods like POP only one optimal plan needs to be maintained at any given time. In another current solution, multiple access methods are run competitively before one is picked. Another DBMS performs star-joins by first computing the intermediate results of all dimension table accesses, and uses the cardinality of these intermediate results to decide the join method for the star-join.
Mid-query re-optimization is a class of techniques in the prior art for changing a QEP during query execution. One current technique provides for re-optimizing after hash join operators by materializing their result, rewriting the SQL query to use this result, and invoking the optimizer again. Materializing means storing the results that are returned by a materialization point into temporary or permanent objects. A materialization point is a point during query execution at which results up to this point have to be processed entirely before query execution can continue. As an example, assume there is a query to join three pieces of information, a customer's account, a customer's transactions, and a customer's address, together. A partial execution of the query execution plan results in only the customer's account and address being joined when execution is stopped. This result, the joined account and address information, is saved as a temporary or permanent object, such as a table. A checkpoint is a point during query execution at which the processed amount of data is compared to the estimates made for query compilation. Yet another current technique called Query Scrambling also re-optimizes queries, but it focuses on handling delayed sources as opposed to incorrect cardinalities. Progressive Optimization (POP) is another instance of this family which monitors intermediate result cardinalities during execution and changes the QEP at well-defined checkpoints. Compared to the other mid-query re-optimization techniques, POP re-optimizes at a much larger number of points during query execution, such as, for example, above federated SHIP operators. POP is also more careful about cost-based reuse of intermediate results. However, POP has the drawback that it greedily re-optimizes as soon as one check-condition has been violated, lacking a global picture of the overall estimation errors in the query execution plan, which may lead to oscillations in the query execution plan and frequent re-optimizations, if many estimates in the query execution plan are incorrect.
LEO is another current technique for addressing cardinality misestimations, which uses query feedback to optimize future queries based on cardinality estimation errors observed during previous query executions.
A completely different approach to adaptive query processing is to view query processing as tuple routing, and to optimize routing of each tuple separately. In Telegraph, a separate Eddy operator is used to continually adapt the tuple routing among other operators. This mechanism is powerful and can be used to adapt join orders, to access paths, and to join algorithms, especially for wide area and Internet data sources. However, per-tuple routing does impose an overhead which can lead to performance regression when the initial plan does not change. Moreover, currently proposed Eddy routing policies are greedy policies; these are fine for certain interactive and continuous processing metrics, but it is not clear if any policy simpler than regular dynamic programming optimization will work for completion time or total work metrics.
There is, accordingly, a need for an improved method of predictable query execution in the presence of multiple estimation errors, which is a problem that very frequently occurs in federated databases where statistics about remote data sources are often imprecise. Exemplary embodiments of the present invention extend the POP technique to deal with multiple estimation errors, by performing a pre-computation of sub-plans of a query, whose statistics are deemed to be unreliable. The query is optimized, reusing the pre-computed results, only after reliable statistics have been established by executing sub-plans of the query. A sub-plan is part of a query that may be computed before any other part of the query is executed, and the intermediate results of the sub-plan do not depend on any other part of the query that has not already been executed.