A database is a collection of stored data that is logically related and that is accessible by one or more users or applications. A popular type of database is the relational database management system (RDBMS), which includes relational tables, also referred to as relations, made up of rows and columns (also referred to as tuples and attributes). Each row represents an occurrence of an entity defined by a table, with an entity being a person, place, thing, or other object about which the table contains information.
One of the goals of a database management system is to optimize the performance of queries for access and manipulation of data stored in the database. Given a target environment, an optimal query plan is selected, with the optimal query plan being the one with the lowest cost, e.g., response time, CPU processing, I/O processing, network processing, etc., as determined by an optimizer. The response time is the amount of time it takes to complete the execution of a query on a given system. In this context, a “workload” is a set of requests, which may include queries or utilities, such as loads, that have some common characteristic(s), such as application, source of request, type of query, priority, response time goals, etc.
Contemporary research involves improving the “robustness” of a query optimizer, where robustness is characterized by query execution plans whose performance are predictable and consistent. From the standpoint of system management, the predictability of query plan performance is very important. When the actual cost of a query plan deviates significantly from the optimizer's estimate, it becomes more difficult to tune and manage the performance of a large system.
Although query robustness is a desirable goal, there is often a tradeoff between query robustness and the goal of achieving the best possible query performance. Optimizers frequently have a choice between a conservative query execution plan that will perform reasonably well in most situations, and an aggressive plan that performs very well if the cost estimates and runtime assumptions are accurate. However, if the cost estimates and runtime assumptions are not accurate, the performance of processing an aggressive plan may be significantly degraded. The decision of whether to pursue an aggressive strategy is strongly influenced by a “confidence” parameter that the optimizer assigns to its estimates which, in turn, are based on various factors, such as the presence of statistics. Such confidence parameters are typically included in the plan information made available to system management tools which may alter rule enforcement logic in view of the confidence parameters. For example, an optimizer's upper bound costs may be used in place of lower bound costs when the associated confidence level is below a particular level.
Although it is useful to factor in the optimizer's assigned confidence levels when regulating workload performance, contemporary approaches are inherently passive and reactive with regards to controlling the desired level of optimizer aggressiveness. As noted above, there is often a tradeoff between predictability and performance, and it would be desirable to set the appropriate balance between the two competing goals by accounting for specific requirements of the individual application or workload. However, contemporary mechanisms do not account for the specific requirements of the individual SQL request or workload when deciding a level of optimizer aggressiveness. In particular, a Service Level Goal (SLG) may be defined for a particular query or workload, e.g., a group of queries, that specifies a desired performance characteristic for the query or workload. For instance, an SLG may specify a desired maximum processing duration of a query. SLGs are typically used by automated workload management systems in deciding how to assign and regulate critical system resources, such as central processing unit (CPU) allocations for processing of the query. Existing automated actions that take into account SLGs are usually limited to the scheduling of resources during the execution of a query and do not influence the query optimization process.
Disadvantageously, no mechanisms are currently available that provide for proactively controlling an optimizer's pursuit of robustness for individual workloads or queries in a manner that accounts for a query or workload SLG.