Many database customers have dynamic workloads that include a variety of queries, such as light-processing queries and processing-intensive business intelligence queries. Typically, customers expect quick responses from the light-processing queries. Customers may tolerate some latency in the business intelligence queries if that latency is commensurate with the complexity of the queries.
However, even a single disproportionately long-running query can be a significant problem because it can consume large amounts of database resources. Generally, disproportionately long-running queries may result from inefficient query plans, which are produced by the database optimizer. Because of the highly disruptive impact of inefficient query plans on the overall system, customers may expect the query optimizer to always generate efficient query plans.
Optimizing queries so that all queries run efficiently may be difficult to accomplish on any platform, even on single-processor platforms. Optimization may be even more difficult on systems with massively parallel processors (MPP). In MPP systems, the optimizer has the additional task of deciding whether to use one, all, or a subset of the processors to run the query plan.
Classic cost-based optimizers may model the costs of alternative query plans and choose the cheapest plan. The cost-based optimizers may base the modeled cost on cardinality estimates (i.e., estimates of how many rows of data flow through each operator). This strategy may be effective for simple queries in which compile-time estimates of cardinality match actual values for run-time cardinality. However, this strategy may generate disproportionately long-running query plans when compile-time estimates of cardinality deviate from actual run-time values.
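
The effect of a cardinality misestimate on plan choice can be illustrated with a minimal sketch. The two join strategies, the cost formulas, the constant `build_overhead`, and the row counts below are all hypothetical values chosen for illustration; they are not taken from any particular optimizer.

```python
# Toy cost model for two alternative join plans.
# All formulas and constants here are illustrative assumptions.

def nested_loop_cost(outer_rows, inner_rows):
    # Each outer row scans the entire inner input.
    return outer_rows * inner_rows

def hash_join_cost(outer_rows, inner_rows, build_overhead=1000):
    # Build a hash table on the inner input, then probe once per outer row.
    return build_overhead + inner_rows + outer_rows

def choose_plan(outer_rows, inner_rows):
    # Model the cost of each alternative and pick the cheapest.
    costs = {
        "nested_loop": nested_loop_cost(outer_rows, inner_rows),
        "hash_join": hash_join_cost(outer_rows, inner_rows),
    }
    return min(costs, key=costs.get), costs

INNER_ROWS = 10_000
estimated_outer = 1      # compile-time cardinality estimate (assumed)
actual_outer = 5_000     # run-time cardinality (assumed)

# Plan selected at compile time, based on the estimate.
chosen, _ = choose_plan(estimated_outer, INNER_ROWS)

# Plan that would have been cheapest given the actual cardinality.
best, actual_costs = choose_plan(actual_outer, INNER_ROWS)

print("chosen at compile time:", chosen)
print("cheapest at run time:  ", best)
print("run-time cost of chosen plan:", actual_costs[chosen])
print("run-time cost of best plan:  ", actual_costs[best])
```

Under these assumed numbers, the plan that looks cheapest for a one-row estimate becomes far more expensive than the alternative once thousands of rows actually arrive, which is the kind of deviation that may produce a disproportionately long-running query.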