1. Field of the Invention
The present invention concerns data base systems generally and more specifically concerns side effects of transactions in data base systems.
2. Description of the Prior Art
A popular type of data base system is the relational data base system, in which the data in the data base is logically organized into relations, that is, tables with rows and columns. When a user of a data base system wishes to obtain information from it, the user makes a query, which describes the information to be obtained in terms of relations, columns, and limitations on the values of the columns in individual rows. The result of the query is another relation, which is termed a view. The relations from which the view is computed are termed the base relations of the view. If the view is in turn used as a base relation in the computation of a second view, the computation of the second view may be speeded up by materializing the view, that is, storing it in the data base system, rather than recomputing it each time it is used as a base relation.
In addition to querying data bases, users often add information to the data base and delete information from the data base. Adding or deleting information is termed a transaction. Of course, when a transaction changes a relation which is a base relation for a materialized view, the materialized view may need to be changed as well. One way of changing the materialized view is to recompute it from the base relations, but the whole point of the materialized view is to avoid recomputation. The problem of finding ways of updating materialized views without completely recomputing them is termed in the art the view maintenance problem.
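The difference between recomputing a materialized view and updating it in place can be illustrated with a minimal Python sketch. It is an illustration only and is neither the method of the invention nor of any cited reference. The relation contents, the salary threshold, and the helper name `view_of` are hypothetical. Relations are held as `collections.Counter` objects mapping each row to its number of occurrences, and the sketch assumes that every deleted row is actually present in the base relation.

```python
from collections import Counter

def view_of(rows: Counter) -> Counter:
    """Hypothetical view: departments of employees earning more than 40000.

    Each row is (name, dept, salary); the Counter value is its multiplicity.
    """
    out = Counter()
    for (name, dept, salary), count in rows.items():
        if salary > 40000:
            out[dept] += count
    return out

# Base relation and its materialized view (computed once).
base = Counter({
    ("Ann", "R&D", 50000): 1,
    ("Bob", "R&D", 30000): 1,
    ("Cho", "Ops", 80000): 1,
})
materialized = view_of(base)

# A transaction: one row inserted, one row deleted.
inserted = Counter({("Dee", "R&D", 60000): 1})
deleted = Counter({("Cho", "Ops", 80000): 1})

# Incremental maintenance: evaluate the view only over the changed rows
# and apply the resulting changes to the stored view.
materialized = materialized - view_of(deleted) + view_of(inserted)

# Apply the transaction to the base relation itself.
base = base - deleted + inserted

# For this view the incremental result agrees with full recomputation.
assert materialized == view_of(base)
```

The shortcut works here because the view consists only of a selection and a projection, which distribute over the addition and removal of rows. Views built from other operators generally need more care, which is what makes the view maintenance problem non-trivial.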
Materialized views are, however, not the only components of data base systems which are affected by changes in base relations. Many data base systems have components which monitor changes in the data base and take actions if certain changes occur. Included among these components are integrity constraints, that is, rules for preventing transactions which are inconsistent with some property of the data base; monitors, which notify the user when certain changes occur; triggers, which take action when the changes occur; and active queries, which are computed in response to triggers. The same techniques which are used to avoid complete recomputation of views when a base relation changes can also be used generally to reduce the amount of computation needed to determine whether an action is to take place in response to a change in a base relation.
For the most part, prior art solutions to the view maintenance problem have been algorithmic, that is, for each kind of change to the base relations, an algorithm is provided which computes the changes to the view. A typical example of the algorithmic approach is S. Ghandeharizadeh, R. Hull, D. Jacobs et al., On implementing a language for specifying active database execution models, In: Proc. VLDB-93. Qian and Wiederhold have employed equational reasoning, that is, the change in the view which is required by the change in the base relations is determined by means of a series of algebraic translations of the change in the base relations. See X. Qian and G. Wiederhold, Incremental Recomputation of Active Relational Expressions, IEEE Transactions on Knowledge and Data Engineering, 3(3):337-341, 1991.
Equational reasoning offers a number of advantages over the algorithmic approach:
Unlike the algorithmic approach, it provides us with precise semantics of changes to the views. Consequently, using the equational approach makes it easier to prove correctness of the change propagation algorithm.
This approach is robust: if the language changes (e.g. new primitives are added), one only has to derive new rules for the added primitives, leaving all other rules intact. As long as the new rules are correct, the correctness of the change propagation algorithm is not affected.
The resulting changes to the view are obtained in the form of expressions in the same language used to define the view. This makes additional optimizations possible. For example, the expressions for changes that are to be made (e.g. for sets/bags of tuples to be deleted/added) can be given as an input to any query optimizer that might find an efficient way of calculating them.
Qian and Wiederhold's work and most of the algorithmic approaches have assumed that relations are set-valued, that is, duplicate rows are eliminated from the view. However, most practical database systems do not eliminate duplicates from views, and consequently cannot be modelled by sets. Instead, they must be modelled by bags, that is, multisets, or collections in which duplicates are permitted. The ability to handle duplicates is particularly important in dealing with aggregate functions in data base systems. Such functions obtain a value by aggregating other values. For instance, if the average salary of employees is to be computed, then one applies the aggregate function AVG to Π_Salary(Employees). Duplicates cannot be removed from the view since the result would be wrong when at least two employees had the same salary. Not eliminating duplicates also speeds up query evaluation, as duplicate elimination is generally a rather expensive operation.
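The effect of duplicate elimination on the AVG example can be seen in a short Python sketch. The employee names and salaries are hypothetical, and the helper names `project_salary_bag`, `project_salary_set`, and `avg` are introduced only for illustration.

```python
# Hypothetical Employees relation: (name, salary) rows.
employees = [("Ann", 50000), ("Bob", 50000), ("Cho", 80000)]

def project_salary_bag(rows):
    """Bag projection onto Salary: one value per row, duplicates kept."""
    return [salary for _, salary in rows]

def project_salary_set(rows):
    """Set projection onto Salary: duplicate salaries are eliminated."""
    return {salary for _, salary in rows}

def avg(values):
    """Arithmetic mean of a non-empty collection of numbers."""
    return sum(values) / len(values)

bag_avg = avg(project_salary_bag(employees))  # (50000 + 50000 + 80000) / 3
set_avg = avg(project_salary_set(employees))  # (50000 + 80000) / 2

print(bag_avg, set_avg)
```

With these rows the bag projection yields the true average salary of 60000, while the set projection counts the shared salary of 50000 only once and yields 65000.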
Unfortunately, the theoretical work done on relations that are assumed to have no duplicates does not carry over to relations that do have them. Further, the little work which has been done on view maintenance when duplicates are present is algorithmic; see A. Gupta, I. S. Mumick, and V. S. Subrahmanian, Maintaining views incrementally, in: SIGMOD-93, pp. 157-166. Finally, Qian and Wiederhold's work contains an error which renders it unable to guarantee a minimality condition, namely that no rows are changed unnecessarily in the view being updated in response to a change in its base relations.
It is thus an object of the present invention to provide solutions for the view update problem which take the existence of duplicate rows in the view into account, which employ equational reasoning, and which are in fact able to guarantee a strong minimality condition.