The present invention relates to computer database management systems, and more particularly to the optimization and transformation of database queries that include OUTER JOIN operations.
Computer systems incorporating Relational DataBase Management System (RDBMS) software using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American Nationals Standard Organization (ANSI) and the International Standards Organization (ISO).
In RDBMS software, all data is externally structured into tables. The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages such as C, COBOL, etc. Operators are provided in SQL that allow the user to manipulate the data, wherein each operator operates on either one or two tables and produces a new table as a result. The power of SQL lies on its ability to link information from multiple tables or views together to perform complex sets of procedures with a single statement.
SQL supports OUTER JOIN and INNER JOIN operations. The semantics of the OUTER JOIN operation are outlined as follows. An OUTER JOIN involves a join of two tables, which are referred to as the left table and the right table. An OUTER JOIN is a modification of an INNER JOIN in that it preserves all information from one or both of the input relations.
An outer join can be further categorized into left, right (single-sided), or full (two-sided) OUTER JOIN, depending on which side needs to be preserved. These operations have two operands, the left and right tables. If tuples of both tables are preserved, then the operation is called FULL OUTER JOIN. If the tuples of only one table are preserved, e.g., the left table or the right table, then the result is respectively a LEFT OUTER JOIN or a RIGHT OUTER JOIN. In the output or answer set, for the non-matching tuples of a preserved table, NULL values are assigned to the columns of the other table.
A SQL query that is submitted to the RDBMS is analyzed by a query optimizer. Based on the analysis, the query optimizer generates an execution plan optimized for efficient execution. The optimized execution plan may be based on a rewrite of the query. One technique for SQL query optimization involves simplifying queries based on the nature of null-intolerant predicates.