1. Field of the Invention.
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of queries by transparently altering properties of relational tables using materialized views.
2. Description of Related Art.
Computer systems incorporating Relational DataBase Management System (RDBMS) software using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American Nationals Standard Organization (ANSI) and the International Standards Organization (ISO).
For most RDBMS software, combinations of tables and views are used to access data stored in tables in the database. Indices are often used to improve the performance of retrieving data from tables. However, indices are generally limited to columns from base tables. Thus, indices are not seen as suitable for:
results of aggregations, and
results of joins for commonly used subsets of the data.
A view definition includes a query that, if processed, provides a temporary results table based on the results of the query at that point in time. Using an INSERT statement and an appropriately defined table in the database, the temporary results table can be stored in the database. To refresh this table, the user would need to perform a DELETE from the table and then perform the INSERT again.
Users can directly query against the created table, provided that the users are aware how the results were derived. Generally, the RDBMS software is not aware that such a table is any different from any other table in the database. However, this table cannot be used by an optimizer within the RDBMS software to improve performance, even though the table may contain data that would drastically improve the performance of other queries.
This leads to the notion of summary tables or materialized views as envisioned by the present invention. These tables are similar to the created table described above, except that the definition of the table is based on a xe2x80x9cfull selectxe2x80x9d (much like a view) that is materialized in the table. The columns of the table are based on the elements of the select list of the full select.
In the present invention, with properly defined summary tables, the RDBMS software can be made aware of how the result in the summary table was derived. When an arbitrarily complex query is submitted, an optimizer in the RDBMS software can consider using the summary tables to answer the query, which is a technique that requires performing subsumption tests between the query and summary table definition, and then performing compensation work once the optimizer decides that the summary table can be used for the answer.
A further evolution of materialized views according to the present invention is the ability to replicate a database-managed replica of a materialized view on each database partition of a table stored on a shared-nothing, massively parallel processing (MPP) computer system. These replicated materialized views improve the performance in situations where co-location of the base tables is not possible (as it will become evident in this invention), and yet the cost of having the data reside on every partition is small. This is typically useful for dimension tables in a data warehouse.
There are extensive research activities and literature on this topic, as disclosed in the following publications, all of which are incorporated by reference herein:
1. L. S. Colby, R. L. Cole, E. Haslam, N. Jazaeri, G. Johnson, W. J. McKenna, L. Schumacher, D. Wilhite. Red Brick Vista: Aggregate Computation and Management. Proceedings of the 14th Int""l. Conference on Data Engineering, Orlando, Fla., 1998.
2. R. Bello, K. Dias, A. Downing, J. Feenan, J. Finnerty, W. Norcott, H. Sun, A. Witkowski, M. Ziauddin. Materialized Views In Oracle. Proceedings of the 24th VLDB Conference, New York, 1998.
3. D. Srivastava, S. Dar, H. jagadish, A. Levy. Answering Queries with Aggregation Using Views. Proceedings of the 22nd VLDB Conference, Mumbai, India, 1996.
However, the current state of art does not address performance issues arising from the MPP environment. Thus, there is a need in the art for improved techniques for the replication of materialized views in an MPP environment.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for optimizing database queries using a materialized view for a base table referenced in the query, wherein the materialized view has different properties than the base table. The query is rewritten to use the materialized view rather than the base table for optimal query performance.
The materialized view may be a vertical and/or horizontal subset of a base table, so that only selected columns and/or tuples from the table are present therein. Columns may be added to the materialized view to contain pre-computed results of expressions, and indices may be created on columns.
The materialized view itself may be replicated across the processors of the computer system. Alternatively, the materialized view may be partitioned across the processors of the computer system, wherein a partitioning key for the materialized view is different from that of the base table referenced in the query.
With the capability of transparent and automatic query rerouting, an optimizer of the RDBMS software has the freedom to choose different query execution strategies based on different properties, and hence, the properties of the base table are transparently altered using the materialized view for the purpose of query optimization.
It is an object of the present invention to optimize queries using materialized views that can be replicated and/or partitioned across multiple processors. More specifically, it is an object of the present invention to optimize RDBMS software using replicated and/or partitioned copies of materialized views.