1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to query transformation for multiple UNION ALL view join queries using join predicates for pruning and distribution.
2. Description of Related Art
Computer systems incorporating Relational DataBase Management System (RDBMS) software using Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
For most RDBMS software, combinations of tables and views are used to access data stored in tables in the database. A view definition includes a query that, if processed, provides a temporary result table based on the results of the query at that point in time.
To enhance maintainability, usability, and performance in database applications with very large data volumes, a UNION ALL view feature has been widely used to partition large tables into smaller tables. A typical partitioning mechanism is based on time. For example, data can be organized into monthly tables, and a UNION ALL view of these monthly tables is used to provide a logical view of all data. Some sophisticated applications have many views with UNION ALL, and join these views in queries.
Existing query optimization techniques will use local predicates in a query to prune useless branches of a UNION ALL view. Moreover, the distribution will generate joins for the remaining branches. These techniques may not be effective for queries with joins of multiple UNION ALL views, for the following reasons.
Although each branch of a UNION ALL view has a local predicate, such as YEAR=2001, the referencing query may not have a local predicate on the view, or it may only have local predicates on other columns. This makes pruning based on local predicates useless, and none of the branches can be pruned.
The distribution transformation will generate joins of branches of the joined views. The number of joins is the product of the number of the branches of the joined views.
Consider the distribution of A*B*C, wherein A, B, C are UNION ALL views of four base tables each, A=(A1+A2+A3+A4), B=(B1+B2+B3+B4), C=(C1+C2+C3+C4), * indicates a join and + indicates a UNION ALL. Performing A*B*C without sub-select pruning due to lack of local predicates will result in:(A1+A2+A3+A4)*(B1+B2+B3+B4)*C1+(A1+A2+A3+A4)*(B1+B2+B3+B4)*C2+(A1+A2+A3+A4)*(B1+B2+B3+B4)*C3+(A1+A2+A3+A4)*(B1+B2+B3+B4)*C4after just distributing A*B into view C. The complete list, after performing distribution into view B and then view A, would contain 64 branches of the UNION ALL view.
When the number of branches of the UNION ALL views gets a little larger, this creates two major problems during the compilation (bind) time. First, it exceeds the query processing limits, the processing cannot continue, or if the limit is not exceeded, it consumes excessive storage at the bind time. Second, bind time is excessively long.
Among the joins generated, many of them are non-overlapping and produce empty results. For example, a branch with YEAR=2001 of view A joining a branch with YEAR=2002 of view B results in an empty result. All the resources at both the bind time and the run time on these empty-result joins are wasted.
Thus, there is a need in the art for a solution for this class of multiple UNION ALL view join queries that solves the above shortcomings. The present invention satisfies these needs.