The performance of applications running against database systems, such as enterprise database systems, may depend on the database design chosen. A database configuration, as used herein, is defined as one or more tables, one or more indices, one or more views, or any combination thereof. To explore potential database configuration designs, typical data systems have incorporated application program interfaces (APIs) that allow “what-if” analysis, which take as an input a query Q and a database configuration C, and return the optimizer-estimated cost of executing Q if configuration C were present.
Tuning the database design may be defined as receiving a representative query workload WL (i.e., a series of queries Q) and constraints on the configuration space, and outputting a configuration from within the configuration space in which executing the workload WL has the least possible cost (as measured by the optimizer cost model). Cost may be defined as the estimated time to execute the workload. To determine the best configuration within the configuration space, a number of candidate configurations from the configuration space are enumerated and then evaluated using the “what-if” analysis such as in a database tuner.
The representative workload is typically obtained by generating queries with a generator tool or tracing the queries that execute against a production system. To trace the queries, a tracing tool, such as IBM Query Patroler, SQL Server Profiler, ADDM, and the like, may be used over a representative period of time on the production system. The tracing may produce a large number of executed queries or statements in this time. To limit the overhead of repeated optimizer calls to evaluate large numbers of configurations/query combinations, typical data design tools may reduce the number of queries for which the optimizer calls are issued by compressing the workload up-front, i.e., initially selecting a subset of queries and then tuning the database design based only this smaller set of queries to determine an optimal database configuration.