1. Field of the Invention
The present invention relates generally to relational database systems, and more particularly to systems and methods for rdbms view maintenance.
2. Description of the Related Art
Relational database systems store large amounts of data in table format, including business data that can be analyzed to support business decisions. For example, a relational database system can be used to store sales data including information on sales by time period, product, product line, geographic area, and so on, which can be usefully presented in response to a query to support a particular business decision. It will readily be appreciated that data can be presented in numerous ways to support a decision depending on the particular decision (and, hence, query) being made.
To respond to a user query, relational database systems often undertake table joins, i.e., operations that relate data in two or more tables. In a relational database, a xe2x80x9cviewxe2x80x9d is a relation that is derived from other relations, and views can be queried, with new views being derived thereby. A view may be xe2x80x9cmaterializedxe2x80x9d by storing the view""s tuples (data) in the database, to thereby promote more efficient querying, since a view""s tuples need not be re-derived for every query involving the view.
As recognized herein, a materialized view can become stale unless it is refreshed, i.e., updated, as the underlying tables are changed (by, e.g., deleting a record or adding a record). Executing queries on stale views can result in out of date query results. Completely recomputing a view, however, can take considerable time away from other processes that use the RDBMS, such that it is often desirable to incrementally update a materialized view.
The present invention recognizes several problems with current incremental view update methods. First, they are atomic, meaning that the transaction updating the view must have available a consistent snapshot of the underlying tables. If the view involves table joins as is often the case, the update transaction must join several combinations of base table and xe2x80x9cdeltaxe2x80x9d tables, i.e., tables that record recent changes to the base tables. This results in a long-lived update transaction, in turn resulting in contention between the view update process and concurrent updates of the underlying tables of the view. This drawback looms larger in e-commerce applications, wherein databases can be continuously accessed around the clock.
Second, incremental update transactions heretofore were synchronous with the update interval. In other words, the view updates had to be calculated at a specific time, because the base tables as they exist at the update time have been required. Thus, it is not possible to decide, say, at 8 P.M. to refresh a materialized view from its 4 P.M. state to its 5 P.M. state. because at 8 P.M. data in the underlying tables may have been changed since 5 P.M. Consequently, the decision to refresh the view must be made at 5 P.M. under these hypothetical circumstances, and the full costs of the update must be paid then even though more resources might be available later on when the computing load is lighter.
Simply refreshing the view using very small refresh intervals, e.g., within every transaction can address some of the above-noted problems but is attended with its own. For instance, updates at very small intervals forces the materialized view to track real time very closely, which can be impractical or even undesirable for some applications, e.g., when the materialized view represents daily sales results.
Moreover, the use of short refresh intervals does not address the synchronization problem noted above. Breaking the process up into two componentsxe2x80x94propagation phase, wherein the view update is evaluated, and apply phase, wherein the update is applied to the viewxe2x80x94only partially addresses the synchronization problem, since the apply phase may be undertaken asynchronously but the propagation phase must still be executed synchronously at the target time of the view refresh absent the inventive features set forth herein. In any case, the present invention recognizes the above-noted problems and provides the solutions disclosed herein.
A general purpose computer is programmed according to the inventive steps herein to undertake updates of materialized database views, regardless of whether the updates are required because of select-project-join views, or views involving unions. The invention can also be embodied as an article of manufacturexe2x80x94a machine componentxe2x80x94that is used by a digital processing apparatus and which tangibly embodies a program of instructions that are executable by the digital processing apparatus to execute the present logic. This invention is realized in a critical machine component that causes a digital processing apparatus to perform the inventive method steps herein.
The invention can be implemented by a computer system including a general purpose computer that executes method acts to refresh a view table. These method acts include deriving changes to at least one base table accessible to the computer to render at least one delta table, with the base table being combinable with other tables to render the view table. Also, the method includes associating at least one record timestamp with at least one record in at least one delta table and combining at least one delta table with at least one base table of the join view to render a combination table. Then, the method contemplates computing at least one combination timestamp representing the combining act. Using the combination timestamp, at least one compensation table to the combination table is computed. At least one compensation table is then combined with at least one combination table to render a view delta table useful for refreshing the materialized view table.
In a preferred embodiment, the view table is materialized, and the view delta table accounts for transactions from the time of the current view materialization to a time ttarget. The rendering of the view delta table is undertaken at a time teval that is later than ttarget, such that the view delta table is rendered asynchronously with respect to ttarget. The evaluation time teval can be dynamically established during each repetition of the logic, such that changes to the view are propagated using a series of transactions the sizes of which are established by the selection of the ranges of timestamps of delta table rows for each repetition. Tuples in the view delta table are timestamped, such that the changes that are timestamped after the time of the current view materialization and before a time thighwater are valid to apply to the materialized view. The act of deriving changes to at least one base table to render at least one delta table accounts for transactions from a time txcex4 prior to at least the time ttarget. Moreover, tuples in each combination table and compensation table are associated with counts, and the count of each tuple is the product of the counts of the tuples from which it was derived, whereas the timestamp of each tuple is the minimum of the timestamps of the tuples from which it was derived. Accordingly, the combination tables and compensation tables in the view delta table have count columns and timestamp columns. When an even number of delta tables are involved in the computation of a compensation table, the counts of the tuples of the compensation table are negated before combining the compensation table with the view delta table.
If desired, the view table can be associated with at least three base tables, and the computer derives changes to the three base tables by accounting for transactions to three respective times tnew1, tnew2, and tnew3, that can be different from each other. Under these circumstances, compensation is undertaken recursively.
In another aspect, a computer-implemented method is disclosed for refreshing a materialized view. The method includes evaluating changes for an intended time tintended to at least one base table associated with the view at an evaluation time teval that is later than the intended time. The method also includes recursively compensating for changes made to base tables between the intended time tintended and evaluation time teval, and then applying the results of the evaluation and compensation to the materialized view to refresh the view.
In still another aspect, a computer program device includes a computer program storage device that is readable by a digital processing apparatus. A program is on the program storage device and it includes instructions that can be executed by the digital processing apparatus to incrementally compute changes to a materialized join view which refect changes to base tables. The program includes computer readable code means for recording updates to a plurality of base tables as they occur. Computer readable code means, at subsequent times, extract the updates that occurred to selected base tables over given time intervals. Also. computer readable code means are provided for combining extracted updates with the base tables in multiple steps to compute changes to the materialized join view.
The details of the present invention, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which: