1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of SQL queries in a relational database management system by exploiting uniqueness properties using a 1-tuple condition.
2. Description of Related Art
Computer systems incorporating Relational DataBase Management System (RDBMS) software using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American Nationals Standard Organization (ANSI) and the International Standards Organization (ISO).
In RDBMS software, all data is externally structured into tables. The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages such as C, COBOL, etc. Operators are provided in SQL that allow the user to manipulate the data, wherein each operator operates on either one or two tables and produces a new table as a result. The power of SQL lies on its ability to link information from multiple tables or views together to perform complex sets of procedures with a single statement.
In the SQL standard, a SELECT statement is used to retrieve data and generally comprises the format: "SELECT &lt;clause&gt; FROM &lt;clause&gt; WHERE &lt;clause&gt; GROUP BY &lt;clause&gt; HAVING &lt;clause&gt; ORDER BY &lt;clause&gt;." The clauses generally must follow this sequence, but only the SELECT and FROM clauses are required. The result of a SELECT statement is a subset of data retrieved by the RDBMS software from one or more existing tables or views stored in the relational database, wherein the FROM clause tells the RDBMS software the name of the table or view from which data is being selected. The subset of data is treated as a new table, termed the result table, which typically comprises a temporary table. In general, the items specified in the SELECT clause of the SELECT statement determine the columns that will be returned in the result table from the table(s) identified in the FROM clause.
A DISTINCT keyword may be used in the SELECT clause to eliminate what would be duplicate rows in the result table. Although the DISTINCT keyword may only be specified in the SELECT clause, it affects the rows that are returned, but not the columns. The DISTINCT keyword acts upon the intermediate result set, so that each row of the result table is unique. The default ALL keyword has the opposite effect in that it returns all rows, including duplicates.
The WHERE clause determines which rows should be returned in the result table. Generally, the WHERE clause contains a search condition that must be satisfied by each row returned in the result table. The rows that meet the search condition form an intermediate set, which is then processed further according to specifications in the SELECT clause. The search condition typically comprises one or more predicates, each of which specify a comparison between two values comprising columns, constants or correlated values. Multiple predicates in the WHERE clause are themselves typically connected by Boolean operators.
A join operation makes it possible to combine tables or views by combining rows from one table or view to another table or view. The rows, or portions of rows, from the different tables or views are concatenated horizontally. The join operation is implied by naming more than one table or view in the FROM clause of a SELECT statement. Although not required, join operations normally include a WHERE clause that identifies the columns through which the rows can be combined. The WHERE clause may also include a predicate comprising one or more conditional operators that are used to select the rows to be joined.
In a SELECT-FROM-WHERE query, identifying situations where at most one tuple is retrieved by the query (termed a "1-tuple condition") permits important query transformations and optimizations. Certain instances of the identification of 1-tuple conditions have been performed in the prior art, as disclosed in the following publications, which are incorporated by reference herein:
1. Hamid Pirahesh, Joseph Hellerstein, and Waqar Hasan, "Extensible/Rule Based Query Rewrite Optimization in STARBURST," Proceedings of ACM SIGMOD '92 International Conference on Management of Data, San Diego, Calif., 1992, (hereinafter referred to as "[PHH92]"); PA1 2. Hugh Darwen, "The Role of Functional Dependence in Query Decomposition," Relational Database Writings 1989-1991, Chapter 10, pp. 133-154, 1992, (hereinafter referred to as "[DAR92]"); and PA1 3. G. N. Paulley and Per-Ake Larson, "Exploiting Uniqueness in Query Optimization," Proceedings of IEEE '94 International Conference on Data Engineering, Houston, Tex., 1994, pp. 68-79, (hereinafter referred to as "[PL94]").
However, there is a need in the art for improved methods of identifying 1-tuple conditions and of optimizing SQL queries by exploiting the identified 1-tuple conditions.