Many database systems are configured to support specialized types of hypothetical queries which are useful in application areas such as, for example, version management, decision support, active databases and integrity maintenance. Most existing approaches provide support for hypothetical queries only in specialized forms. Frameworks for expressing hypothetical queries for general-purpose usage have been discussed in S. Ghandeharizadeh, R. Hull and D. Jacobs, "Elevating Deltas to be First-Class Citizens in a Database Programming Language," ACM Trans. on Database Systems, 21(3):370-426, 1996, and J. Woodfill and M. Stonebraker, "An Implementation of Hypothetical Relations," Proc. of Intl. Conf. on Very Large Databases, pp. 157-165, September 1983, both of which are incorporated by reference herein. For example, a simple hypothetical query as described in the above-cited S. Ghandeharizadeh et al. reference may be of the form Q when η, where Q is a conventional database query and η denotes a hypothetical database state expression that may be a hypothetical update expression {U}, an explicit substitution of the form {Q_1/R_1, ..., Q_n/R_n}, or any other suitable hypothetical-state expression. Consider a department store database which maintains the following two relations in the form of stored tables: Depts[dept_name, mgr_name], and Products[prod_name, price, color, dept_name]. The relation Depts stores identifiers of departments (dept_name) and their managers (mgr_name), while the relation Products stores the name (prod_name), price, color and selling department (dept_name) of each product. An example of a simple hypothetical query to this database may be expressed informally as follows: [What managers sell at least one product with price ≥ 100] when [Move all green products to Garden department].
More complex hypothetical queries may include multiple when constructs or multiple nested when constructs.
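The department store example can be made concrete with a minimal Python sketch. It only illustrates the intended meaning of a simple hypothetical query: the conventional query is answered as if the update had been applied, while the stored tables stay unchanged. The sample rows, the manager names and the helper names (`managers_of_expensive`, `move_green_to_garden`, `when`) are assumptions introduced for illustration and do not come from the cited references.

```python
from typing import NamedTuple


class Dept(NamedTuple):
    dept_name: str
    mgr_name: str


class Product(NamedTuple):
    prod_name: str
    price: int
    color: str
    dept_name: str


# Illustrative sample data (assumed, not taken from the source).
DEPTS = [Dept("Garden", "Alice"), Dept("Toys", "Bob"), Dept("Housewares", "Carol")]
PRODUCTS = [
    Product("mower", 250, "red", "Garden"),
    Product("kayak", 400, "green", "Toys"),
    Product("spoon", 5, "green", "Housewares"),
    Product("blender", 150, "white", "Housewares"),
]


def managers_of_expensive(depts, products, min_price=100):
    """Conventional query Q: managers selling at least one product with price >= min_price."""
    expensive_depts = {p.dept_name for p in products if p.price >= min_price}
    return {d.mgr_name for d in depts if d.dept_name in expensive_depts}


def move_green_to_garden(products):
    """Hypothetical update: return a new Products table with green products in Garden."""
    return [p._replace(dept_name="Garden") if p.color == "green" else p for p in products]


def when(query, depts, products, update):
    """Answer `query` in the state produced by `update`; the stored tables are not modified."""
    return query(depts, update(products))


actual = managers_of_expensive(DEPTS, PRODUCTS)
hypothetical = when(managers_of_expensive, DEPTS, PRODUCTS, move_green_to_garden)
```

With this sample data the manager of the Toys department appears in the actual answer but not in the hypothetical one, because the only expensive Toys product is green and would be moved to Garden.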
Well-known direct evaluation techniques are often used to process the non-hypothetical conventional query Q. A direct evaluation of Q is an evaluation in which the actual operators mentioned in Q are performed directly. It is known that given a query Q, there may be some other query Q' which is equivalent to Q in that Q' will give the same answer as Q, but where the direct evaluation of Q' is much faster than the direct evaluation of Q. Using this fact, conventional direct evaluation of a query Q may be more efficiently implemented using an optimization step and an evaluation step. The optimization step includes generating a set of queries Q' that are equivalent to Q using "rewriting rules" based on algebraic equivalences, estimating the amount of time it would take to directly evaluate the various queries Q', and selecting a particular query Q' that appears to be fastest to directly evaluate. The evaluation step involves directly evaluating the selected equivalent query Q'. These techniques for evaluation of non-hypothetical queries are generally referred to as "lazy" evaluation techniques.
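The two-step scheme described above can be sketched as follows. This is a toy illustration under stated assumptions, not a description of any particular optimizer: the two equivalent plans are written by hand rather than generated by rewriting rules, and the cost formulas and the `selectivity` constant are invented for the example.

```python
# Illustrative tables (assumed data): Depts[dept_name, mgr_name] and
# Products[prod_name, price, color, dept_name] as plain tuples.
DEPTS = [("Garden", "Alice"), ("Toys", "Bob")]
PRODUCTS = [
    ("mower", 250, "red", "Garden"),
    ("ball", 5, "blue", "Toys"),
    ("kite", 20, "green", "Toys"),
]


def join_then_select(depts, products):
    """Plan Q: join Depts with Products first, then keep rows with price >= 100."""
    joined = [(d, p) for d in depts for p in products if d[0] == p[3]]
    return {d[1] for d, p in joined if p[1] >= 100}


def select_then_join(depts, products):
    """Equivalent plan Q': select expensive products first, then join the smaller set."""
    expensive = [p for p in products if p[1] >= 100]
    return {d[1] for d in depts for p in expensive if d[0] == p[3]}


def cost_join_then_select(depts, products):
    # Assumed cost model: a nested-loop join compares every pair of rows.
    return len(depts) * len(products)


def cost_select_then_join(depts, products, selectivity=0.3):
    # Assumed cost model: one scan of Products, then a join over the reduced set.
    return len(products) + len(depts) * selectivity * len(products)


CANDIDATES = [
    (join_then_select, cost_join_then_select),
    (select_then_join, cost_select_then_join),
]


def optimize(depts, products):
    """Optimization step: pick the candidate plan with the lowest estimated cost."""
    plan, _cost = min(CANDIDATES, key=lambda pair: pair[1](depts, products))
    return plan


def evaluate(depts, products):
    """Evaluation step: directly evaluate the plan chosen by the optimizer."""
    return optimize(depts, products)(depts, products)


result = evaluate(DEPTS, PRODUCTS)
```

Both plans return the same answer by construction; only the estimated cost differs, which is what the optimization step exploits.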
Although the efficient evaluation of non-hypothetical queries by the above-described lazy evaluation and other techniques is well developed, little work has been done in the area of efficient evaluation of hypothetical queries. More particularly, lazy evaluation techniques have generally not been applied to evaluation of hypothetical queries. Instead, existing techniques for evaluating hypothetical queries are generally "eager" with regard to the evaluation of the hypothetical states. An eager evaluation approach, when given a query of the form Q when η, generally computes and materializes in the database a representation of the hypothetical database state η, or a representation of the net change, or delta, called for by η. The conventional query Q is then evaluated using a standard technique such as the above-described direct evaluation, and the result is "filtered" against the materialized representation of η. Exemplary eager evaluation techniques are used in the Heraclitus language described in the above-cited S. Ghandeharizadeh et al. reference, and M. Doherty, R. Hull and M. Rupawalla, "Structures for Manipulating Proposed Updates in Object-Oriented Databases," Proc. ACM SIGMOD Symp. on the Management of Data, pp. 306-317, 1996, which is also incorporated by reference herein. Eager evaluation techniques are also used in conventional version management systems, and in implementations of hypothetical relations as described in the above-cited J. Woodfill and M. Stonebraker reference.
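A minimal sketch of the eager approach follows, assuming the delta is represented as a pair of tuple sets (deletions and insertions) and that "filtering" means reading the stored table through that delta. Both choices are assumptions made for illustration; the cited systems use their own delta structures. The sample rows and function names are likewise hypothetical.

```python
# Illustrative tables (assumed data): Depts[dept_name, mgr_name] and
# Products[prod_name, price, color, dept_name] as plain tuples.
DEPTS = [("Garden", "Alice"), ("Toys", "Bob"), ("Housewares", "Carol")]
PRODUCTS = [
    ("mower", 250, "red", "Garden"),
    ("kayak", 400, "green", "Toys"),
    ("spoon", 5, "green", "Housewares"),
    ("blender", 150, "white", "Housewares"),
]


def materialize_delta(products):
    """Eagerly compute the net change called for by 'move all green products to Garden'."""
    deletes, inserts = set(), set()
    for row in products:
        name, price, color, dept = row
        if color == "green" and dept != "Garden":
            deletes.add(row)
            inserts.add((name, price, color, "Garden"))
    return deletes, inserts


def filtered_scan(products, delta):
    """Read the stored table through the delta: skip deleted rows, then add inserted rows."""
    deletes, inserts = delta
    for row in products:
        if row not in deletes:
            yield row
    yield from inserts


def managers_of_expensive(depts, product_rows, min_price=100):
    """Conventional query Q, evaluated directly over whatever rows it is given."""
    expensive = {dept for (_name, price, _color, dept) in product_rows if price >= min_price}
    return {mgr for (dept, mgr) in depts if dept in expensive}


def eager_when(depts, products):
    delta = materialize_delta(products)  # the delta is fully built before Q runs
    return managers_of_expensive(depts, filtered_scan(products, delta))


delta = materialize_delta(PRODUCTS)
answer = eager_when(DEPTS, PRODUCTS)
```

Note that the whole delta is built before the query runs, even for the cheap green product that can never affect the answer; that up-front work is what makes the approach eager.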
A conventional optimizer may be used in conjunction with eager evaluation to optimize a hypothetical query Q when η in the manner previously described, that is, to find a Q' equivalent to Q such that direct evaluation of Q' filtered by η is faster than direct evaluation of Q filtered by η. Other techniques have been developed for computing and representing a hypothetical state η quickly, and in a manner that attempts to ensure that the filtering operation is relatively fast. Existing eager evaluation techniques thus attempt separate optimizations of the conventional query Q and the hypothetical database state η, but generally assume that the when construct will be evaluated directly through the above-noted filtering mechanism. The existing eager techniques thereby fail to provide sufficiently fast evaluation of hypothetical queries in many important applications.
As noted above, lazy evaluation strategies, although utilized in other database management applications such as the evaluation of weakest preconditions, have generally not been applied to evaluation of hypothetical queries. Evaluation of weakest preconditions is described in greater detail in, for example, E. W. Dijkstra, "Guarded Commands, Nondeterminacy and Formal Derivations of Programs," Comm. ACM, 18:453-457, 1975, E. W. Dijkstra, "A Discipline of Programming," Prentice-Hall, 1976, D. Gries, "The Science of Programming," Springer-Verlag, 1981, X. Qian, "An Axiom System of Database Transactions," Information Processing Letters, 36:183-189, 1990, and X. Qian, "The Expressive Power of the Bounded-Iteration Construct," Acta Informatica, 28(7):631-656, October 1991, all of which are incorporated by reference herein. Other lazy evaluation techniques have been developed in logic programming and datalog applications, as described in, for example, D. M. Gabbay and U. Reyle, "N-Prolog: An Extension of Prolog with Hypothetical Implications," Journal of Logic Programming, 1(4):319-355, 1984 and 2(4):251-283, 1985, A. J. Bonner, "Hypothetical Datalog: Complexity and Expressibility," Theoretical Computer Science, 76:3-51, 1990, and A. J. Bonner, "The Logical Semantics of Hypothetical Rulebases with Deletion," Journal of Logic Programming, 1995, all of which are incorporated by reference herein. However, these and other lazy or hybrid lazy-eager evaluation techniques, although not requiring materialization of hypothetical states, fail to provide a sufficiently general framework for optimizing hypothetical query evaluation in a variety of applications.
It is therefore apparent that a need exists for improved techniques for evaluating hypothetical database queries, such that the hypothetical queries can be processed more efficiently than is possible using the existing techniques described above.