1. Technical Field
The present invention generally relates to data processing systems and in particular to database query optimization in data processing systems.
2. Description of the Related Art
As data volume grows rapidly in many data warehouse systems, new database optimization techniques are sought to improve the performance of queries against these data warehouse systems. One way to improve the query performance is to reduce the size of input data needed to process a query. For example, if an annual sales total query is issued against a data warehouse fact table whose data grain is at a daily level, 365 daily sales records would be required to derive a yearly sales result. However, if quarterly sales data were pre-computed against this same fact table and the quarterly sales data were stored inside a table, this pre-computed quarterly sales data may be used to compute the yearly sales data. In this case, only 4 quarterly sales records will be needed to derive this yearly sales data, a sizable reduction from the original 365 daily sales records.
A key enabler of this kind of powerful query optimization technique lies at the mathematical equation used in the calculation, in which equivalent query results can be derived from some intermediate query results pre-calculated from the same set of input data. For example, if the mathematical equation involved in a query is a SUM function, then SUM over 100 raw data points is equivalent to a SUM of two Subtotals such that each Subtotal is a SUM over 50 original raw data points. Or a SUM over 100 raw data points is equivalent to a SUM of four Subtotals such that each Subtotal is a SUM over 25 original raw data points. If these subtotals are pre-calculated and stored, these subtotals may be used to help compute the SUM over 100 raw data points in an efficient manner.
In a relational database system, these subtotals may be pre-aggregated for certain measures and the results stored into a table. This result table is called a Materialized Query Table (MQT) in a database collection (e.g., International Business Machine's (IBM's) database 2 (DB2)) and a Materialized Views (MV) in Oracle. The query used to compute these results is called the definition query of this MQT (or MV). For convenience, this pre-aggregate technology in a relational database may be referred to as MQT technology from here on.
Though the MQT technology can be applied to any queries, the MQT technology is extremely popular in business intelligence (BI) applications as queries generated from these applications often involve some kind of aggregations. Therefore, the performance of BI applications is heavily influenced by the performance of the query processing component of a relational database engine that, in turn, is heavily influenced by the performance of its MQT (or MV) technology. Though the MQT technology has been proven to provide amazing query performance by re-using some pre-computed query results, a relational database engine places a higher premium on the reliability and accuracy of these equivalent query results.