The present invention relates to a method and system for incremental database maintenance, and in particular to updating materialized views in a database.
In a data warehouse, views are computed and stored in the database to allow efficient querying and analysis of the data. These views stored at the data warehouse are known as materialized views. In order to keep the views in the data warehouse up to date, it is necessary to maintain the materialized views in response to the changes at the sources. The view can be either recomputed from scratch, or incrementally maintained by propagating the base data changes onto the view so that the view reflects the changes. Incrementally maintaining a view can be significantly cheaper than recomputing the view from scratch, especially if the size of the view is large compared to the size of the changes.
The problem of finding such changes at the views based on changes to the base relations has come to be known as the view maintenance problem and has been studied extensively. Traditional maintenance techniques propagate insertions and deletions from the base relations to the view through each of its operations. Several improved schemes have been proposed for incremental maintenance of view expressions. However, problems arise with traditional techniques and with prior improved schemes. For example, none of the previously known schemes efficiently handles the case of general view expressions involving aggregate and outerjoin operators. As another example, most of the incremental maintenance approaches compute and propagate insertions and deletions at each node in a view expression tree, which is often inefficient.
A need arises for an incremental maintenance technique which efficiently handles general view expressions involving aggregate and outerjoin operators.
The present invention is a method and system for incrementally maintaining a database having at least one materialized view based on at least one table. When changes to the table are received, a change table based on the received changes is generated. The generated change table is propagated upwards to form a higher-level change table and the materialized view is updated by applying the higher-level change table to the materialized view using a refresh operation.
In one aspect of the present invention, the change table includes a plurality of tuples representing the changes and the materialized view includes a plurality of tuples. The refresh operation has two parameters, a join condition and an update function specification. The materialized view is updated by finding all tuples in the materialized view that match the tuple in the change table, using the join condition, for each tuple in the change table and updating each found tuple in the materialized view by performing operations indicated by the update function specification.
In one embodiment of the present invention, the materialized view is an aggregated materialized view. In this embodiment, each tuple in the materialized view and in the change table includes at least one aggregated attribute and may include one or more non-aggregated attributes. Tuples in the materialized view that match the tuple in the change table are found, using the join condition, by matching the non-aggregated attribute of a tuple in the change table, if any, with the non-aggregated attribute of a tuple in the view. Each found tuple in the materialized view is updated by updating the aggregated attribute of the tuple in the view using the aggregated attribute of the tuple in the change table.
In one aspect of this embodiment, the materialized view is further updated by inserting a tuple from the change table into the materialized view, if no tuples are found in the materialized view that match the tuple from the change table, and deleting a tuple from the materialized view, if an aggregated attribute representing a count of a number of tuples in a group represented by a the tuple becomes zero.
In another embodiment of the present invention, the materialized view is an outerjoin materialized view based on a plurality of tables. In this embodiment, there is a tuple in the materialized view corresponding to each value of at least one selected attribute in any of the plurality of tables. Each tuple in the materialized view comprises at least one selected attribute and at least one other attribute. Tuples in the materialized view that match the tuple in the change table are found, using the join by matching the at least one selected attribute of a tuple in the change table with the at least one selected attribute of a tuple in the view. Each found tuple in the materialized view is updated by updating the at least one other attribute of the tuple in the view using the at least one other attribute of the tuple in the change table.
In one aspect of this embodiment, the materialized view is further updated by inserting a tuple from the change table into the materialized view, if no tuples are found in the materialized view that match the tuple from the change table and deleting a tuple from the materialized view, if a value of the at least one selected attribute of the tuple is no longer present in any table.