Of particular interest in today's computing environment are relational database applications. Relational DataBase Management System (RDBMS) software using a Structured Query Language (SQL) interface is well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
In RDBMS software, all data is externally structured into tables. The SQL interface allows users to formulate relational operations on the tables interactively, in batch files, or embedded in host languages such as C and COBOL. Operators are provided in SQL that allow the user to manipulate the data, wherein each operator operates on either one or two tables and produces a new table as a result. The power of SQL lies in its ability to link information from multiple tables or views together to perform complex sets of procedures with a single statement.
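As a minimal sketch of how a single statement can link two tables and produce a new table, the following example uses Python's standard sqlite3 module with an in-memory database. The table names (customer, orders), the column names, and the sample rows are hypothetical and chosen only for illustration; they do not come from any particular RDBMS described here.

```python
import sqlite3


def orders_per_customer():
    """Join two hypothetical tables and aggregate the result in one statement."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and sample data, for illustration only.
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0);
        """
    )
    # A single SQL statement: the join takes two tables as input and the
    # grouping produces a new result table with one row per customer.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount) AS total
        FROM customer AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows
```

Calling orders_per_customer() returns one row per customer with the summed order amount, which illustrates the point that each relational operator consumes one or two tables and yields a table.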
To improve RDBMS performance in evaluating and satisfying queries, the parallelism inherent in the multiple CPUs or I/O devices available in a computer system can be exploited during execution. For example, when performing a sequential scan of a table that is stored across multiple I/O devices, the scans on the separate I/O devices can be performed at the same time, reducing I/O time through the concurrency of multiple asynchronous I/O operations on the devices. Parallelism can also be exploited by using multiple CPUs to evaluate the data according to criteria provided by a query, so that total elapsed time is lowered by overlapping query processing across the multiple CPUs. A more complex form of parallelism involves partitioning the query execution plan among CPUs and executing the operations in parallel.
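The idea of scanning separately stored portions of a table at the same time can be sketched as follows. This is an illustrative assumption rather than the mechanism of any particular RDBMS: the partitions are plain in-memory lists standing in for the portions of a table on separate I/O devices, worker threads stand in for concurrent asynchronous I/O operations, and the names scan_partition and parallel_scan are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Sequentially scan one partition and keep the rows matching the query criteria."""
    return [row for row in rows if predicate(row)]


def parallel_scan(partitions, predicate, degree):
    """Scan all partitions with up to `degree` concurrent workers.

    Each list in `partitions` is a stand-in for the part of a table stored
    on one I/O device. Executor.map yields results in partition order, so
    the merged output is deterministic.
    """
    merged = []
    with ThreadPoolExecutor(max_workers=degree) as pool:
        results = pool.map(lambda part: scan_partition(part, predicate), partitions)
        for partial in results:
            merged.extend(partial)
    return merged
```

For example, parallel_scan([[1, 2, 3], [4, 5, 6], [7, 8, 9]], lambda x: x % 2 == 0, 3) scans the three partitions concurrently and merges the qualifying rows. In CPython, threads mainly overlap waiting time such as I/O; overlapping CPU-bound predicate evaluation would typically call for separate processes instead.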
While parallelism can exploit the capabilities of system components, such exploitation may increase resource utilization to the point that no significant benefit is realized. For example, parallel execution of a task, such as a sort, generates subtasks, or child tasks, that each require workfile usage and virtual storage consumption. Such resource consumption can cause resource contention among the subtasks and can limit resource availability for other tasks being executed within the same RDBMS. Thus, a trade-off exists between improved response time and resource overhead for parallel task execution.
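The trade-off can be made concrete with a deliberately simple sketch, which is an assumption for illustration and not the method of the present invention: if each subtask is assumed to consume a fixed amount of virtual storage, the number of parallel subtasks can be capped so that their combined consumption stays within a budget. The function name choose_degree and its parameters are hypothetical.

```python
def choose_degree(max_degree, storage_per_subtask, storage_budget):
    """Cap the degree of parallelism by a storage budget (hypothetical model).

    max_degree          -- the most subtasks the hardware could run at once
    storage_per_subtask -- assumed fixed storage cost of one subtask
    storage_budget      -- storage that may be spent on this task overall
    """
    if max_degree < 1 or storage_per_subtask <= 0 or storage_budget < 0:
        raise ValueError("invalid parameters")
    # Number of subtasks whose combined storage cost fits in the budget.
    affordable = storage_budget // storage_per_subtask
    # Always allow at least one subtask, i.e. fall back to sequential execution.
    return max(1, min(max_degree, affordable))
```

Under this model, a task that could use eight subtasks at 64 units of storage each is limited to four subtasks when the budget is 256 units, giving up some response time in exchange for leaving resources available to other tasks.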
Accordingly, a need exists for a manner of balancing response time against resource utilization to improve database query execution. The present invention addresses such a need.