1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of a star join operation in a relational database management system using a bitmap index structure.
2. Description of Related Art
Relational DataBase Management Systems (RDBMS) using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
In an RDBMS, all data is externally structured into tables. A table in a relational. database is two dimensional, consisting of rows and columns. Each column has a name, typically describing the type of data held in that column. As new data is added, more rows are inserted into the table. A user query selects some rows of the table by specifying clauses that qualify the rows to be retrieved based on the values in one or more of the columns.
The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages such as C, COBOL, etc. Operators are provided in SQL that allow the user to manipulate the data, wherein each operator performs functions on one or more tables and produces a new table as a result. The power of SQL lies on its ability to link information from multiple tables or views together to perform complex sets of procedures with a single statement.
The SQL interface allows users to formulate relational operations on the tables. One of the most common SQL queries executed by the RDBMS is the SELECT statement. In the SQL standard, the SELECT statement generally comprises the format: xe2x80x9cSELECT  less than clause greater than FROM  less than clause greater than WHERE  less than clause greater than GROUP BY  less than clause greater than HAVING  less than clause greater than ORDER BY  less than clause greater than .xe2x80x9d The clauses generally must follow this sequence, but only the SELECT and FROM clauses are required.
Generally, the result of a SELECT statement is a subset of data retrieved by the RDBMS from one or more existing tables stored in the relational database, wherein the FROM clause identifies the name of the table or tables from which data is being selected. The subset of data is treated as a new table, termed the result table.
A join operation is usually implied by naming more than one table in the FROM clause of a SELECT statement. A join operation makes it possible to combine tables by combining rows from one table with another table. The rows, or portions of rows, from the different tables are concatenated horizontally. Although not required, join operations normally include a WHERE clause that identifies the columns through which the rows can be combined. The WHERE clause may also include a predicate comprising one or more conditional operators that are used to select the rows to be joined.
Star joins involve one or more dimension tables joined to a fact table. Star join operations can also be costly in terms of performance time.
Techniques have been developed for minimizing the time required to perform a star join operation. However, there is still a need in the art for additional optimization techniques for star join operations.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for optimizing a star join operation in relational database management systems (RDBMS). A cross-product is generated from a plurality of dimension tables referenced by the star join. The join columns of the cross-product are then hashed to create a hash-row value. Using the hash-row value, a Star Map is probed to determine whether a record that corresponds to the cross-product exists in a fact table, wherein a first portion of the hash-row value is used to select a row of the Star Map and a second portion of the hash-row value is used to select a column of the selected row of the Star Map. The fact table is accessed to perform a join with the cross-product when the selected column of the selected row of the Star Map indicates that the record might exist in the fact table.