1. Field of the Invention
This invention relates generally to database management systems and, more particularly, to efficient evaluation of SQL statements processed in relational database management systems.
2. Description of the Related Art
Information is frequently stored in computer processing systems in the form of a relational database. A relational database stores information as a collection of tables having interrelated columns and rows. A relational database management system (RDBMS) provides a user interface to store and retrieve the information and provides a query methodology that permits table operations to be performed on the data. One such RDBMS interface is the Structured Query Language (SQL) interface, which permits users to formulate operations on the data tables either interactively, or through batch file processing, or embedded in host languages such as C, COBOL, or the like.
SQL provides table operations with which users can request database information and form one or more new tables out of the operation results. Data from multiple tables, or views, can be linked to perform complex sets of table operations with a single statement. The table operations are specified in SQL statements called queries. One typical SQL operation in a query is the "SELECT" operation, which retrieves table rows and columns that meet a specified selection parameter. Another operation permitted by SQL is the "JOIN" operation, which concatenates all or part of two or more tables to create a new resulting table. For example, a query might produce a table that contains the names of all supervisory employees who live in a given city, and might do so by specifying a SELECT operation to retrieve employee names and resident cities from one table, and then performing a JOIN of that data after a SELECT operation to retrieve employee names and job titles from another table.
Evaluation of SQL Statements
An SQL query generally includes at least one predicate, which is an SQL expression that can assume a logical value of TRUE, FALSE, or UNKNOWN. A predicate typically either specifies a data range, tests for an existence condition, tests for equivalence, or performs a similar table comparison operation. In an SQL system, a query is received through the SQL interface and is received by an SQL processor that rewrites the query from the input format provided by the user into generally standard SQL language, and also may include optimization processing performed by a query optimizer. An SQL query may be equivalently expressed in many different ways. The query optimizer typically chooses from among alternative combinations of SQL operators to ensure a query that may be more efficiently evaluated. For example, the SQL optimizer typically can determine how to organize intermediate results (intermediate tables) so data operations during evaluation are carried out with maximum efficiency.
To evaluate a query, an SQL execution plan is generated by the SQL processor from the parsed, optimized SQL input. To generate an execution plan, the SQL processor considers the available access paths to the data and considers system statistics on the data to be accessed to select what it considers to be the most efficient access path to evaluate the query and retrieve the results. In considering the available access paths, the processor checks table indexes, sequential read operations needed, and the like to determine how it will retrieve data. The system statistics considered in choosing from available access paths include statistics on the size of tables, the number of distinct values in columns of tables, the network locations of tables, and the like. Consideration of the available access paths and the system statistics yields the single access path determined to be the most efficient, which is selected for the execution plan. In this way, the SQL processor "binds" the access path to the query. Finally, the SQL interface executes the optimized query plan, retrieves the data, and provides the results to the user.