1. Field of the Invention
This invention relates in general to computerized relational databases, and in particular, to a method, apparatus and program product for performing queries against a relational database using an aggregate join index.
2. Description of Related Art
Relational DataBase Management Systems (RDBMS) using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
In an RDBMS, all data is externally structured into tables. A table in a relational database is two dimensional, consisting of rows and columns. Each column has a name, typically describing the type of data held in that column. As new data is added, more rows are inserted into the table. A user query selects some rows of the table by specifying clauses that qualify the rows to be retrieved based on the values in one or more of the columns.
The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages such as C, COBOL, etc. Operators are provided in SQL that allow the user to manipulate the data, wherein each operator performs functions on one or more tables and produces a new table as a result. The power of SQL lies on its ability to link information from multiple tables or views together to perform complex sets of procedures with a single statement.
The SQL interface allows users to formulate relational operations on the tables. One of the most common SQL queries executed by the RDBMS is the SELECT statement. In the SQL standard, the SELECT statement generally comprises the format: xe2x80x9cSELECT less than clause greater than FROM less than clause greater than WHERE less than clause greater than GROUP BY less than clause greater than HAVING less than clause greater than ORDER BY less than clause greater than .xe2x80x9d The clauses generally must follow this sequence, but only the SELECT and FROM clauses are required.
Generally, the result of a SELECT statement is a subset of data retrieved by the RDBMS from one or more existing tables stored in the relational database, wherein the FROM clause identifies the name of the table or tables from which data is being selected. The subset of data is treated as a new table, termed the result table.
A join operation is usually implied by naming more than one table in the FROM clause of a SELECT statement. A join operation makes it possible to combine tables by combining rows from one table with another table. The rows, or portions of rows, from the different tables are concatenated horizontally. Although not required, join operations normally include a WHERE clause that identifies the columns through which the rows can be combined. The WHERE clause may also include a predicate comprising one or more conditional operators that are used to select the rows to be joined.
Join indexes have been used to improve the performance of join operations in RDBMS""s. A description of join indexes, including a novel join index, can be found in co-pending and commonly-assigned pending U.S. patent application Ser. No. 09/073,113, filed on May 5, 1998, by Chi Kim Hoang, entitled xe2x80x9cA JOIN INDEX FOR RELATIONAL DATABASESxe2x80x9d, which application is incorporated by reference herein.
Techniques have been developed for maximizing performance using join indexes. However, there remains a need in the art for additional optimization techniques through the use of join indexes.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses methods, apparatus, articles of manufacture, and data structures related to the creation and use of an aggregate join index. The aggregate join index combines one or more commonly-used columns of one or more tables with the results from one or more aggregation expressions pre-computed from one or more columns of the tables.