The use of a Relational Database Management System (RDBMS) is well known in the art. Relational databases of the relational database management system are organized into tables that are comprised of rows and columns of data, wherein the rows are tuples and the columns are attributes. The tables are typically stored on direct access storage devices (DASD), such as magnetic or optical disk drives, for semi-permanent storage.
The use of a structured query language (SQL) interface to access data in the relational database management system is also well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO). The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages such as C and COBOL.
The definitions for SQL provide that a RDBMS responds to a particular query with a particular set of data given a specified database content, but the method that the RDBMS uses to actually find the required information in the tables is left up to the RDBMS. Typically, more than one method can be used by the RDBMS to access the required data. Each of these methods is a query execution plan (QEP).
The query execution plan is a compiled run-time structure used for executing the SQL statement. The query execution plan is the path the RDBMS uses to get to the data requested by the SQL statements. For example, an SQL statement may search an entire table space or the SQL statement may use an index. The query execution plan indicates how well an SQL statement performs. The data associated with the query execution plan is stored in a catalogue or, optionally, in a plan table.
During bind time, the RDBMS software uses various data, including statistics in a RDBMS catalogue, to determine which query execution plan to use. A utility updates the RDBMS catalogue with statistics on table spaces, indexes, tables, and columns. Additionally, when an SQL statement is processed during a bind phase, a query execution plan is determined for the SQL statement.
Typically, when there are two or more query execution plans available for selection, a query execution is selected based on a detailed analysis of the execution costs of each alternative query execution plan. The RDBMS selects an optimal query execution plan to minimize the computing time or resources used and, therefore, minimize the cost of performing the query.
One approach to further minimize computing time or resources used in executing a query utilizes a materialized query table (MQT). Materialized query tables are tables whose contents are defined by a query. In the current state of the art, a materialized query table is used to speed up the execution of another query by effectively rewriting the original query into a semantically equivalent query that references the materialized query table. The rewrite can be performed directly by a user or a tool writing the query. When performed directly by a user, the rewrite of the query using a materialized query table is referenced as “direct MQT substitution”. When performed automatically by the query optimizer, the rewrite of the query using a materialized query table is referenced as “MQT routing”.
For example, consider a federated database configuration in which a data warehouse managed by one RDBMS (RDBMS1) is connected to a data mart that is managed by another RDBMS (RDMBS 2). The tables of RDBMS2 are derived from certain tables of RDBMS1 and are represented in RDBMS2 using materialized query tables of the RDBMS2. For example, the RDBMS2 table CA-Cities comprising information about California cities is derived from the table of the RDBMS1 that contains information about US-Cities, and is defined as follows:                CREATE TABLE RDBMS2.CA-Cities as (SELECT * FROM RDBMS1.US-Cities A WHERE A.state=‘CA’)        
Queries of RDBMS2 can be written directly against the local materialized query tables. Although this technology has proven to be useful, it would be desirable to present additional improvements. On occasion, a query of RDBMS2 is required to drill down to the server of RDBMS1 to obtain more detailed information. Such a query is the following exemplary query that requests detailed sales information for California cities in a specific zip code.
SELECT *
FROM RDBMS2.CA-Cities a, RDBMS1.Sales b
WHERE a.ZipCode=95120 AND a.citycode=b.citycode.
In the current state of the art, a query optimizer typically considers only execution plans that perform the join by moving one of the tables. For example, the query optimizer may move the entire Sales table from the RDBMS1 to the RDBMS2. Alternatively, the query optimizer may consider performing a separate probe of the Sales table for each qualifying city code. Both of these execution plans can be inefficient due to excessive movement of data or communication between one data source and another.
When a user specifies a materialized query table in a query, a conventional query optimizer is constrained to use those specified query tables even though the resulting query performance is not optimum. A conventional query optimizer does not consider the inverse of the routing process of the materialized query table whereby a query that uses direct MQT substitution is automatically rewritten by the optimizer into a semantically equivalent query that does not reference the materialized query table. This process is referenced as “reverse MQT routing” (further referenced herein as reverse routing).
Consider the following equivalent query that is arrived at using a reverse routing optimization:
SELECT
FROM RDBMS1.US-Cities a, RDBMS1.Sales b
WHERE A.state=‘CA’ a AND a.ZipCode=95120 AND a.citycode=b.citycode.
Execution plans for this equivalent query are likely much more efficient than alternative plans for the original query as the tables participating in the join reside on the same data source, RDBMS1.
What is therefore needed is a system, a computer program product, and an associated method for reverse routing a materialized query tables in a database. The need for such a solution has heretofore remained unsatisfied.