1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of SQL queries in a relational database management system using hash star join operations.
2. Description of Related Art
Relational DataBase Management Systems (RDBMS) using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
In RDBMS software, all data is externally structured into tables. The SQL interface allows users to formulate relational operations on the tables interactively, in batch files, or embedded in host languages such as C and COBOL. SQL provides operators that allow the user to manipulate the data, wherein each operator performs functions on one or more tables and produces a new table as a result. The power of SQL lies in its ability to link information from multiple tables or views together to perform complex sets of procedures with a single statement.
A table in a relational database system is two dimensional, consisting of rows and columns. Each column has a name, typically describing the type of data held in that column. As new data is added, more rows are inserted into the table. A user query selects some rows of the table by specifying clauses that qualify the rows to be retrieved based on the values in one or more of the columns.
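By way of illustration only, the row selection described above can be sketched in Python using the standard sqlite3 module. The table name "product", its columns, and the sample rows are hypothetical and are not taken from this description; the sketch merely shows a clause qualifying rows by the value in one column.

```python
import sqlite3


def select_rows(min_price):
    """Return the (name, price) rows whose price is at least min_price."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: each column has a name describing the data it holds.
    conn.execute("CREATE TABLE product (id INTEGER, name TEXT, price REAL)")
    # Adding data means inserting more rows into the table.
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(1, "bolt", 0.25), (2, "gear", 4.5), (3, "motor", 30.0)],
    )
    # The WHERE clause qualifies the rows to be retrieved by a column value.
    rows = conn.execute(
        "SELECT name, price FROM product WHERE price >= ? ORDER BY id",
        (min_price,),
    ).fetchall()
    conn.close()
    return rows
```

Calling select_rows(1.0) retrieves only the rows whose price column satisfies the qualifying clause.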
One of the most important operations in the execution of SQL queries is the join of two or more tables. A user can specify selection criteria from more than one table by specifying how to join the tables. This is normally done by a conditional operator on the columns of the tables.
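A join with selection criteria drawn from more than one table might look like the following minimal sketch, again using Python's sqlite3 module. The "customer" and "orders" tables, their columns, and their contents are assumptions made for illustration. The equality condition in the ON clause is the conditional operator relating columns of the two tables.

```python
import sqlite3


def orders_for_city(city, min_amount):
    """Return (order_id, amount) for orders placed by customers in a city."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables related through the shared cust_id column.
    conn.executescript(
        """
        CREATE TABLE customer (cust_id INTEGER, city TEXT);
        CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (1, 'Dayton'), (2, 'Austin');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
        """
    )
    # The ON clause joins the tables; the WHERE clause applies selection
    # criteria to columns from both tables.
    rows = conn.execute(
        """
        SELECT o.order_id, o.amount
        FROM orders AS o
        JOIN customer AS c ON o.cust_id = c.cust_id
        WHERE c.city = ? AND o.amount >= ?
        ORDER BY o.order_id
        """,
        (city, min_amount),
    ).fetchall()
    conn.close()
    return rows
```

Here the city criterion applies to one table and the amount criterion to the other, while the join condition determines which rows are paired.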
However, join operations can be quite costly in terms of performance. In joins that involve two tables, each row of the first table might be joined to many rows of the second table. Joins that involve more than two tables are usually resolved by joining two tables at a time. In this case, the order in which the tables are joined is extremely important, because it determines the size of the intermediate results that later joins must process.
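The effect of join order can be sketched with a small, self-contained Python example. The naive nested-loop join, the three tables, and their contents are all hypothetical; the sketch is not the technique of this invention and only shows that two orders producing the same final result can build intermediate results of very different sizes.

```python
def equijoin(left, right, key):
    """Naive nested-loop equijoin of two lists of dicts on a shared column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def run_plan(base, first, first_key, second, second_key):
    """Join base with first, then with second, two tables at a time.

    Returns (intermediate row count, final row count).
    """
    intermediate = equijoin(base, first, first_key)
    final = equijoin(intermediate, second, second_key)
    return len(intermediate), len(final)


# Hypothetical data: 20 sales rows, 5 items (every sale matches one item),
# and a single qualifying store (only a quarter of the sales match it).
sales = [{"store": s, "item": i} for s in range(4) for i in range(5)]
items = [{"item": i} for i in range(5)]
stores = [{"store": 0}]

# Joining the unselective table first carries all 20 rows forward.
items_first = run_plan(sales, items, "item", stores, "store")
# Joining the selective table first carries only 5 rows forward.
stores_first = run_plan(sales, stores, "store", items, "item")
```

In this sketch both plans yield the same five final rows, but the first plan materializes twenty intermediate rows while the second materializes only five.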
Although techniques have been developed for optimizing SQL query expressions involving joins, there is still a need in the art for additional optimization techniques for join operations.