1. Technical Field
The present invention relates generally to relational databases and, more particularly, to a system and method for real-time materialized view maintenance for relational databases.
2. Description of the Related Art
Recently, there has been a growing trend to use data warehouses to make real-time decisions about a corporation's day-to-day operations. Most major relational database management system (RDBMS) vendors have spent great efforts on real-time data warehousing, including IBM's business intelligence, MICROSOFT's digital nervous system, ORACLE's Oracle10g, NCR's active data warehouse, and COMPAQ's zero-latency enterprise.
A real-time data warehouse needs to handle real-time, online updates in addition to the traditional data warehouse query workload. This raises a problem that is present to a lesser degree in traditional data warehouses, namely when a base relation is updated, maintaining the materialized view(s) defined on it can bring a heavy burden to the corresponding RDBMS.
To mitigate this problem, several methods have been proposed to detect irrelevant updates to a base relation R that do not affect the materialized view MV defined on R. For example, see the following, which are each incorporated by reference herein: Blakeley et al, “Updating Derived Relations: Detecting Irrelevant and Autonomously Computable Updates”, ACM Transactions on Database Systems (TODS), 1989, 14 (3), pp. 369-400; Blakeley et al., “Efficiently Updating Materialized Views”, ACM International Conference on Management of Data (SIGMOD), 1986, pp. 61-71; and Levy et al., “Queries Independent of Updates”, International Conference on Very Large Data Bases (VLDB), 1993, pp. 171-181. However, all of these methods are “content-independent” in the sense that they only consider the “where” clause condition in a materialized view's definition while ignoring the content in the other base relations of the materialized view. As a result, these methods make over-conservative decisions and miss a large number of filtering opportunities.
For example, consider the following materialized view MV:
create materialized view MV as
select * from R, S, T
where R·a=S·b and S·c=T·d
and R·e>20 and S·f=“xyz” and T·g=50;
Assume that a materialized view MV records anomaly exists so that very few tuples in R, S, and T satisfy the where clause condition (R·a=S·b and S·c=T·d and R·e>20 and S·f=“xyz” and T·g=50) in the MV's definition. Suppose a tuple tR whose tR·e=30 is inserted into base relation R. Since tR·e>20, the existing prior art methods in the above-referenced articles cannot tell whether or not the MV will change. Therefore, the standard materialized view maintenance method has to be used, as follows. S is checked for a matching tuple(s) tS such that tS·b=tR·a and tS·f=“xyz”. If such a matching tuple tS exists, then T is further checked for matching tuple(s) tS such that tS·d=tS·c and tT·g=50. If both S and T are large and cannot be cached in memory, then such checking can incur a large number of input and output operations and become fairly expensive. However, because of the MV records anomaly, it is most likely that the insertion of tR into R will not affect the MV and, thus, all of the expensive checking is wasted.