In contemporary database management systems, query optimizers are responsible for providing a good execution plan for a given query. In general, the way in which a query optimizer transforms an input query into an execution plan determines the performance of processing that query. As a result, a query optimizer needs to be thoroughly tested before being deployed in an actual database management system to ensure that it functions correctly.
Many contemporary query optimizers are transformation rule-based optimizers, which use rules to transform input queries into equivalent but more efficient ones. To test such optimizers, rule testing is needed with respect to “coverage” and “correctness.” Coverage refers to ensuring that a given transformation rule has been exercised during query optimization in several different queries. Correctness refers to verifying that the transformation does not alter the query results.
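The correctness notion above can be illustrated with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in engine and uses a hand-written rewrite that pushes a filter below a join as a stand-in for an optimizer transformation; the table names, data, and the result_bag helper are hypothetical and chosen only for illustration. A real rule test would instead compare plans produced with the rule enabled and disabled inside the optimizer under test.

```python
import sqlite3
from collections import Counter

# Hypothetical schema and data, used only to illustrate the check.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1, 5), (11, 1, 50), (12, 2, 70), (13, 4, 90);
""")

# Query as written: join first, then filter.
original = """
    SELECT c.name, o.amount
    FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE o.amount > 20
"""

# Hand-applied transformation: filter pushed below the join.
rewritten = """
    SELECT c.name, o.amount
    FROM (SELECT * FROM orders WHERE amount > 20) o
    JOIN customers c ON o.customer_id = c.id
"""

def result_bag(sql):
    """Return the result as a multiset so row order does not matter."""
    return Counter(conn.execute(sql).fetchall())

# The transformation is considered correct on this data if both bags match.
same = result_bag(original) == result_bag(rewritten)
print("results identical:", same)
```

Comparing multisets rather than lists avoids false alarms when the two plans return the same rows in a different order. Agreement on one data set does not prove equivalence; it only fails to find a counterexample.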
It is thus desirable to have test cases in the form of SQL queries such that when the queries are optimized, they exercise all rules that need to be tested. Further, it is valuable to test that a set of rules (e.g., generally a pair of rules) is exercised together in a query, because this tests for any issues caused by rule interactions.
Contemporary testing techniques use stochastic methods to randomly generate SQL queries in a trial-and-error approach until a query is found that exercises the desired rule or rule pair. However, it can take many trials to find even a single query that exercises the given rule or rule pair, let alone several such queries. For example, consider a transformation rule that pulls up a Group-By operator over a left outer-join; a randomly generated query is not likely to succeed unless by chance it includes a Group-By and a left outer-join in the same query, which may take a large number of trials. Further, such randomly generated queries tend to be rather complex, and thus optimizing the query in each trial can take a large amount of time.
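The trial-and-error behavior described above can be sketched with a toy simulation. This is not an actual query generator or optimizer: the operator vocabulary, the random_query_shape generator, and the exercises_groupby_pullup check are all hypothetical, and the check tests only the necessary precondition (a Group-By and a left outer-join appear in the same query) rather than whether a real optimizer would fire the rule.

```python
import random

# Hypothetical operator vocabulary for a toy random query generator.
OPERATORS = [
    "SELECT", "INNER JOIN", "LEFT OUTER JOIN",
    "GROUP BY", "ORDER BY", "UNION", "DISTINCT",
]

def random_query_shape(rng, max_ops=4):
    """Pick a random set of operators standing in for a generated SQL query."""
    k = rng.randint(1, max_ops)
    return set(rng.sample(OPERATORS, k))

def exercises_groupby_pullup(shape):
    """Toy stand-in for 'the rule fired': both operators must be present."""
    return {"GROUP BY", "LEFT OUTER JOIN"} <= shape

def trials_until_hit(rng, max_trials=100_000):
    """Generate random shapes until one could exercise the rule."""
    for trial in range(1, max_trials + 1):
        shape = random_query_shape(rng)
        if exercises_groupby_pullup(shape):
            return trial, shape
    raise RuntimeError("no qualifying query found within max_trials")

rng = random.Random(0)  # fixed seed so runs are repeatable
counts = [trials_until_hit(rng)[0] for _ in range(1000)]
print("mean trials per qualifying query:", sum(counts) / len(counts))
```

Even with only seven operators, most generated shapes miss the required combination. With a realistic SQL grammar and the additional conditions a real rule imposes, the expected number of trials grows accordingly, and each trial also pays the cost of optimizing a complex query.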