Relational database management systems are well-known in the art. In a relational database, information is structured in a collection of tables in which data values are stored in rows under various column headings. The Structured Query Language ("SQL") allows users to access databases maintained under any number of relational database management systems and has become the standard for relational database access.
Data is retrieved from the relational database by means of a SQL query, such as, in particular, a so-called SQL "SELECT" statement. A simple SQL SELECT statement may be of the form
SELECT specified field(s) PA1 FROM specified table(s) PA1 WERE specified condition(s) is true. PA1 SELECT name PA1 FROM employees PA1 WHERE sal=100
For example, the query
results in a list of the names of those employees earning $100, where "employees" is a table defined to include information about all employees of a particular company.
Other operations may be specified in, or result from, a SQL query. Some examples are as follows. Data from two or more tables may be combined in a "join" operation. "Views" can be derived from one or more so-called "base tables." Aggregates, e.g., such operators as SUM and COUNT, specify operations to be performed on the collection of values in some column of a table. The GROUP BY operator allows for tables to be grouped by any combination of their fields. Finally, SELECT statements may also be nested, thereby forming different types of subqueries.
Since any combination of such SQL operations as those above may be found in one SQL query, a SQL query may become quite complex, and, in fact, this complexity has increased as SQL queries have evolved over time. In particular, simple queries are typically "one block" queries, that is, they can be expressed with one SELECT statement having single FROM, W]HERE, HAVING, and/or GROUPBY clauses. Simple queries have no subqueries or views. In contrast, a complex SQL query is composed of multiple blocks. An example of a complex SQL query is the so-called "decision-support" queries. Organizations have come to base decisions on results from these queries which are o-Ren defined using grouping/aggregation view relations and correlated subqueries (i.e., a subquery which is dependent upon some variable(s) whose value is determined in an "outer" query) .
SQL queries express what results are requested but do not state how the results should be obtained. In other words, the query itself does not tell how the query should be evaluated by the relational data base management system. Rather, a component called the optimizer determines the "plan" or the best method--for example, in terms of I/O and CPU processing costs--of accessing the data to implement the SQL query.
Because of the potential complexity of SQL queries, query optimization, especially with respect to decision-support queries, has become very important. Different approaches to decision-support query optimization include the use of relational algebra and "magic sets rewriting".