An optimizer is located in or is part of a compiler of a database. Generally speaking, the optimizer compiles a received query into a plan, called a plan of execution or “query plan.” The optimizer then examines several different potential plans of execution for each query. For each of the examined plans the optimizer generates a cost based upon a costing model internal to the optimizer, so that it can choose the plan with the least amount of cost as the means for executing the query.
The execution cost is typically determined in a very analytical way. Generally speaking, the optimizer divides a database query into its operations and sub-operations, and determines from a cost model how many resources will be used to execute each operation and sub-operation. The optimizer then determines a sum of these incremental execution costs to determine the overall execution cost for a query. The execution cost is typically a unitless number which represents a cost in terms of resource utilization. Generally, the optimizer chooses to execute a query plan with the lowest calculated execution cost. Though not guaranteed, one goal of this selection method is that an query plan with a lower execution cost will likely execute the database query more quickly than an query plan with a higher execution cost.
User's typically desire some estimate of the runtime that it will take for a query to execute, not just a goal that one query plan will likely execute more quickly than another. There are a variety of reasons for this. For example, by having an estimate of the query runtime, a user (or a system) can more effectively manage the workload of a database. That is, decisions can be made such as: how many queries to perform at once; what order the queries should be performed in; and how to load balance the database system. Additionally, if a user knows that a query will take a long time, he may take an action, such as: reformatting the query, not executing the query, or deciding to do something with his time while the query is executing. Thus, the more accurate an estimate is, the better it can be utilized for planning and managing purposes
While the execution cost is a fairly accurate predictor of relative execution cost (in terms of resource utilization costs) for choosing among a variety of query plans, it is not an accurate mechanism for estimating the runtime that a particular query is likely to take to execute. Generally speaking, execution runtimes estimated from an execution cost are not very reliable and often vary widely when compared to actual execution runtimes. Moreover, the more loaded a database (i.e., the more queries it is simultaneously executing), the longer a query will typically take to execute. Wide variability and unpredictably long execution times are especially a concern in database systems such as Enterprise Data Warehouses where wide variations in the loading cause execution runtimes of even identical queries to vary widely. This variability of execution runtimes causes difficulties in management of database workload and/or in user decisions with respect to database queries. Though some products do exist to predict query runtimes, these products are typically specific to a particular database and do not give consideration to the loading of a database when predicting the execution runtime of a query.
The drawings referred to in this description should not be understood as being drawn to scale unless specifically noted.