A query statement can be compiled into a query plan consisting of query operators. A query operator can be executed in many different ways, for example full table scans, index scans, nested loop joins, hash joins, and others. A query optimizer is a component of a database management system that attempts to determine the most efficient way to execute a query. The query optimizer determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query. The determination is a useful step in the processing of any query statement and can greatly affect execution time.
The query optimizer compares the available query plans for a target input query and estimates which plan will be the most efficient in practice. One type of query optimizer operates on a cost basis and assigns an estimated cost to each possible query plan, for example selecting the plan with the smallest cost. Costs can be used to estimate the runtime cost of evaluating the query in terms of factors such as the number of I/O operations required, processor load requirements, and other factors which can be set forth in a data structure. The set of available query plans that are examined is formed by examining the possible combinations of different database operators (algorithm implementations), such as index scan and sequential scan, and join algorithms including sort-merge join, hash join, nested loops, and others. A search space can become very large according to complexity of the query.
Progress estimation during database query processing enables workload management, both human and automatic, to determine the remaining run-time of active requests and modify the overall work schedule accordingly, for example abort an execution, lower or raise a query's priority, focus on an alternative activity, and the like. In addition, accurate progress estimation permits early detection of query execution plans based on inaccurate estimates and thus chosen erroneously.
A task of a query optimizer is to determine how much progress a database query has made. In simple systems, for example an automatic teller machine that handles an automatic teller type of query such as an account information request or transaction on the account, queries are small but frequent, and optimization is simple. The problem is more difficult with a large data warehouses because a database query is much more complex, such as “for our top 100 customers, how many purchased products from three or more product lines.” Such complex queries combine data from very large tables and predict the volume of data that involved at any stage of answering the query. Practically, a query may be needed to get any of the estimates. Because the queries are so complex and the amount of data processed at any stage is uncertain, determining the time for running a query, the amount of work remaining for the query, and whether the query is making progress or stuck are very difficult.