A Multidatabase System (MDBS) is a facility that supports global applications accessing data stored in multiple databases. It is assumed that the access to these databases is controlled by autonomous and (possibly) heterogeneous Local Database Systems (LDBSs). The MDBS architecture allows local transactions and global transactions to coexist. Local transactions are submitted directly to a single LDBS, while the multidatabase (global) transactions are channeled through the MDBS interface. The objectives of multidatabase transaction management are to avoid inconsistent retrievals and to preserve the global consistency in the presence of multidatabase updates. The concept used to evaluate whether a multidatabase transaction management function preserves global consistency is serializability. A concurrent execution of transactions is serializable if it produces the same output and has the same effect on the database as some serial execution of the same transactions. Furthermore, a global execution is serializable if there exists a total order which is compatible with all the local serialization orders of the global transactions at the participating local database systems. These objectives are more difficult to achieve in MDBSs than in homogeneous distributed database systems because, in addition to the problems caused by data distribution and replication that all distributed database systems have to solve, transaction management in MDBSs must also cope with heterogeneity and autonomy of the participating LDBSs.
In a multidatabase environment, the serializability of local schedules is, by itself, not sufficient to maintain multidatabase consistency. To assure that global serializability is not violated, earlier proposals either require the MDBS to validate local schedules or severely restrict the concurrency of global transaction processing. However, the local serialization orders are neither reported by the local database systems, nor can they be determined by controlling the submission of the global subtransactions or by observing their execution order. To determine the serialization order of the global transactions at each LDBS, the MDBS must deal not only with direct conflicts that may exist between the subtransactions of multidatabase transactions but also with the indirect conflicts that may be caused by the local transactions. Since the MDBS has no information about the existence and behavior of the local transactions, it is difficult to determine whether an execution of global and local transactions is globally serializable.
To illustrate this point, consider FIG. 1, which shows two multidatabase transactions G.sub.1 and G.sub.2, and a local transaction T.sub.1 in a prior art multidatabase system (MDBS) 21 having an MDBS processor 22 and multiple local database systems of which two, LDBS.sub.1 and LDBS.sub.2, are depicted. The two global transactions G.sub.1 and G.sub.2 are typically requested by one or more system users, whereas the local transaction T.sub.1 would typically be requested by a user of the local database. In this example, global transaction G.sub.1 is comprised of two subtransactions; one of the subtransactions writes .beta. to LDBS.sub.2 and the other subtransaction reads .alpha. in LDBS.sub.1. Global transaction G.sub.2 is also comprised of two subtransactions; the first subtransaction writes .alpha. to LDBS.sub.1 and the second reads .delta. from LDBS.sub.2. Local transaction T.sub.1 is comprised of two operations, a write of .delta. and a read of .beta., both in LDBS.sub.2. In FIG. 1 the transaction G.sub.1 writing data item .beta. is shown as a path to .beta. from G.sub.1. The path to G.sub.1 from .alpha. denotes that G.sub.1 reads .alpha.. This notation is used to depict read and write operations. In our example, the global transactions have subtransactions in both LDBSs. In LDBS.sub.1, because G.sub.1 reads .alpha. and G.sub.2 writes it, G.sub.1 and G.sub.2 directly conflict and the serialization order of the transactions is G.sub.1 .fwdarw.G.sub.2. In LDBS.sub.2, because G.sub.1 and G.sub.2 access different data items, there is no direct conflict between them. However, since the local transaction T.sub.1 reads .beta. and writes .delta., G.sub.1 and G.sub.2 conflict indirectly in LDBS.sub.2. In this case, the serialization order of the transactions in LDBS.sub.2 becomes G.sub.2 .fwdarw.T.sub.1 .fwdarw.G.sub.1. Now the global conflict is apparent.
In LDBS.sub.2, G.sub.2 precedes G.sub.1, whereas in LDBS.sub.1 the reverse is true. In summary:
Transactions at LDBS.sub.1: G.sub.1 reads .alpha., G.sub.2 writes .alpha.; serialization order: G.sub.1 .fwdarw.G.sub.2.
Transactions at LDBS.sub.2: T.sub.1 reads .beta., G.sub.1 writes .beta., G.sub.2 reads .delta., T.sub.1 writes .delta.; serialization order: G.sub.2 .fwdarw.T.sub.1 .fwdarw.G.sub.1.
In a multidatabase environment the MDBS has control over the execution of global transactions and the operations they issue. Therefore, the MDBS can detect direct conflicts involving global transactions, such as the conflict between G.sub.1 and G.sub.2 at LDBS.sub.1 in FIG. 1. However, the MDBS has no information about local transactions and the indirect conflicts they may cause. For example, since the MDBS has no information about the local transaction T.sub.1, it cannot detect the indirect conflict between G.sub.1 and G.sub.2 at LDBS.sub.2. Although both local schedules are serializable, the transactions are globally non-serializable, i.e. there is no global order involving G.sub.1, G.sub.2 and T.sub.1 that is compatible with both local schedules.
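The non-serializability of the FIG. 1 example can be checked mechanically. The following is a minimal sketch, not part of any cited method: it assumes each local serialization order is given as a list of (earlier, later) pairs, and it writes G.sub.1, G.sub.2, T.sub.1, LDBS.sub.1 and LDBS.sub.2 as the plain names `G1`, `G2`, `T1`, `LDBS1` and `LDBS2`. The function name `global_serial_order` is hypothetical. A compatible global order exists only if the union of the local orders has no cycle.

```python
from graphlib import TopologicalSorter, CycleError

# Local serialization orders from the FIG. 1 example, as (earlier, later) pairs.
full_view = {
    "LDBS1": [("G1", "G2")],
    "LDBS2": [("G2", "T1"), ("T1", "G1")],
}

# What the MDBS can observe: only direct conflicts between global transactions.
# The local transaction T1 and the edges it induces are invisible to it.
mdbs_view = {
    "LDBS1": [("G1", "G2")],
    "LDBS2": [],
}

def global_serial_order(local_orders):
    """Return a total order compatible with every local order, or None if none exists."""
    sorter = TopologicalSorter()
    for edges in local_orders.values():
        for earlier, later in edges:
            # `later` must come after `earlier` in any compatible total order.
            sorter.add(later, earlier)
    try:
        return list(sorter.static_order())
    except CycleError:
        # A cycle in the union of local orders means no compatible total order.
        return None

actual = global_serial_order(full_view)    # None: G1 -> G2 -> T1 -> G1 is a cycle
apparent = global_serial_order(mdbs_view)  # an order exists from the MDBS's viewpoint
print(actual, apparent)
```

With the full information the check fails, while the MDBS's restricted view suggests that the order G1, G2 is acceptable. This is the gap that the indirect conflict through T.sub.1 creates.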
In the early work in this area, the problems caused by indirect conflicts were not fully recognized. In their early paper, Gligor and Popescu-Zeletin ("Concurrency control issues in distributed heterogeneous database management systems", in Distributed Data Sharing Systems, North-Holland, 1985) stated that a schedule of multidatabase transactions is correct if multidatabase transactions have the same relative serialization order at each LDBS where they (directly) conflict. Breitbart and Silberschatz (Proceedings of the SIGMOD International Conference on Management of Data, June 1988) have shown that the above correctness criterion is insufficient to guarantee global serializability in the presence of local transactions. They proved that a sufficient condition for global consistency is that the multidatabase transactions have the same relative serialization order at all sites where they execute. The problem then becomes: how does the MDBS ensure that multidatabase transactions have the same relative serialization order at all local sites if the MDBS is not aware of all direct and indirect conflicts?
Several solutions have been proposed in the prior art to deal with this problem; however, most of them are not satisfactory. The main problem with the majority of the proposed solutions is that they do not provide a way of assuring that the serialization order of the global transactions is consistent with all the local serialization orders without violating the autonomy of the local databases.
Alonso, Garcia-Molina, and Salem in "Concurrency control and recovery for global procedures in federated database system", Quarterly Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, September 1987, propose the use of site locking in the altruistic locking protocol to prevent undesirable conflicts between multidatabase transactions. Given a pair of multidatabase transactions G.sub.1 and G.sub.2, the simplest altruistic locking protocol allows the concurrent execution of G.sub.1 and G.sub.2 if they access different LDBSs. If there is an LDBS that both G.sub.1 and G.sub.2 need to access, G.sub.2 cannot access it before G.sub.1 has finished its execution there. However, Du, Elmagarmid, Leu and Osterman in "Effects of autonomy on maintaining global serializability in heterogeneous distributed database systems", Proceedings of the Second International Conference on Data and Knowledge Systems for Manufacturing and Engineering, October 1989, show that global serializability may be violated even when multidatabase transactions are submitted serially to their corresponding LDBSs. The scenario in FIG. 1 illustrates this problem. G.sub.1 is submitted to both sites, executed completely and committed. Only then is G.sub.2 submitted for execution; nevertheless, global consistency may be violated.
Another solution is proposed by Wolski and Veijalainen, "2PC Agent method: Achieving serializability in presence of failures in a heterogeneous multidatabase", Proceedings of PARBASE-90 Conference, February 1990. They propose that if all the LDBSs use two-phase locking (2PL), a strict scheduling algorithm, the strict LDBSs will not permit local executions that violate global serializability. However, even local strictness is not sufficient. To illustrate the problem, consider again the transactions in FIG. 1 with the following local schedules. In LDBS.sub.1, G.sub.1 reads .alpha. and commits, then G.sub.2 writes .alpha. and commits. In LDBS.sub.2, G.sub.1 obtains a read lock on .beta. and reads .beta.; G.sub.2 obtains a read lock on .beta., reads .beta., then releases the read lock on .beta. and does not obtain any more locks; G.sub.1 obtains a write lock on .beta., writes .beta., releases all its locks and commits; and then G.sub.2 commits. The serialization order in LDBS.sub.1 is G.sub.1 .fwdarw.G.sub.2, whereas in LDBS.sub.2 the serialization order is G.sub.2 .fwdarw.G.sub.1. Both schedules are strict and are allowed by 2PL, but global serializability is violated.
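The two serialization orders in this example follow directly from the read and write operations. The sketch below is illustrative only: it assumes each local schedule is reduced to an ordered list of (transaction, operation, data item) triples, with lock, unlock and commit steps omitted, and it uses the plain names `G1` and `G2` for G.sub.1 and G.sub.2. The function name `conflict_edges` is hypothetical. Two operations conflict when they come from different transactions, touch the same data item, and at least one of them is a write.

```python
def conflict_edges(schedule):
    """Return the set of (earlier, later) transaction pairs implied by conflicting operations."""
    edges = set()
    for i, (t1, op1, item1) in enumerate(schedule):
        for t2, op2, item2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and item1 == item2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

# Read/write operations of the two local schedules described above.
ldbs1 = [("G1", "r", "alpha"), ("G2", "w", "alpha")]
ldbs2 = [("G1", "r", "beta"), ("G2", "r", "beta"), ("G1", "w", "beta")]

edges1 = conflict_edges(ldbs1)  # {("G1", "G2")}
edges2 = conflict_edges(ldbs2)  # {("G2", "G1")}
print(edges1, edges2)
```

The two reads of .beta. do not conflict with each other, so the only edge at LDBS.sub.2 comes from G.sub.2's read preceding G.sub.1's write, which gives the order opposite to that at LDBS.sub.1.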
U.S. Pat. No. 4,881,166 (Thompson and Breitbart, "Method for Consistent Multidatabase Transaction Processing") proposes detecting conflicts using site cycles. If a conflict involving read operations is found at a site, a search for a new site is conducted. If a conflict involving write operations is found, the transaction is aborted. In this patent, the MDBS using site graphs has no way of determining when it is safe to remove the edges of committed global transactions. The method may work correctly if the removal of the edges corresponding to committed transactions is delayed; however, concurrency would then be sacrificed.
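To make the notion of a site cycle concrete, the following is a rough sketch under stated assumptions, not the procedure of the cited patent, whose data structures are not described here. It assumes a site graph is an undirected bipartite graph with one node per global transaction and one node per site, and an edge wherever a transaction accesses a site. The function name `has_site_cycle` and the input format are hypothetical. A union-find structure reports a cycle as soon as an edge joins two nodes that are already connected.

```python
def has_site_cycle(accesses):
    """accesses maps each global transaction to the set of sites it touches."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for txn, sites in accesses.items():
        for site in sites:
            root_txn = find(("txn", txn))
            root_site = find(("site", site))
            if root_txn == root_site:
                # The new edge closes a cycle in the site graph.
                return True
            parent[root_txn] = root_site
    return False

fig1 = {"G1": {"LDBS1", "LDBS2"}, "G2": {"LDBS1", "LDBS2"}}
disjoint = {"G1": {"LDBS1", "LDBS2"}, "G2": {"LDBS1"}}
print(has_site_cycle(fig1), has_site_cycle(disjoint))
```

Under these assumptions the FIG. 1 transactions form a cycle because G1 and G2 share two sites. The sketch also suggests why edge removal matters: if the edges of G1 were dropped as soon as G1 committed, the cycle would no longer be visible when G2 arrives.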
C. Pu, "Superdatabases for composition of heterogeneous databases", IEEE Proceedings of the 4th International Conference on Data Engineering, 1988, demonstrates that global serializability can be assured if the LDBSs present their local serialization orders to the MDBS. Since traditional database management systems do not provide their serialization order, Pu proposes modifying the LDBSs to present the local serialization orders to the MDBS. However, this solution violates the local autonomy of the LDBSs.
It is, therefore, the objective of our invention to enforce global serializability of transactions in a multidatabase system without violating the autonomy of the local databases.