1. Field of the Invention
This invention is related to computers and computer systems, and in particular to database management systems (DBMS).
2. Description of the Related Art
A database management system keeps track of changes made to a database during its lifetime. These changes are called "transactions" and each transaction is made up of operations such as insert, update, and delete. The operations are recorded in a "log file" in the order in which the operations are performed in the system. This log file may contain entries of operations performed from the time the database was created or from the time the database was last backed up until the present.
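As an illustration only, a log file of this kind can be pictured as an ordered list of operation records. The sketch below assumes a database modeled as a Python dictionary and log records of the form (operation, key, value); the record layout and the function name `apply_log` are hypothetical and are not taken from any particular DBMS.

```python
def apply_log(db, log):
    """Replay log records, in the order recorded, against a dict-based database copy."""
    for op, key, value in log:
        if op in ("insert", "update"):
            db[key] = value          # insert and update both set the item's value
        elif op == "delete":
            db.pop(key, None)        # deleting a missing key is ignored in this sketch
    return db


# Hypothetical log: operations appear in the order they were performed.
log = [
    ("insert", "a", 1),
    ("update", "a", 2),
    ("insert", "b", 5),
    ("delete", "b", None),
]
copy_of_db = apply_log({}, log)
```

Replaying the same records in a different order would generally yield a different database state, which is why the recorded order matters.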
With the increasing emphasis in the database community on historical data, data warehousing, and updating remote or warehouse copies of a database, there is a need to develop a global log file that, when applied to a copy of a database, recreates the results of changes to the original database. As part of that task, it is necessary to first reconstruct the original order of transactions. For serial database systems, the transaction order can easily be inferred by examining the log files for each transaction; creating a global log file in that case is trivial. However, in parallel or distributed database systems, which consist of nodes each of which performs a part of a transaction, creating a global log file is much more complex. In such systems, after all the nodes have finished processing their parts of the transaction, the results are put together to complete the transaction. Each node has its own log file associated with it, often called a "local log file," in which all of the operations performed by the node are recorded in the order completed. On a "share-nothing" model of a parallel or distributed database system, each node operates on a partition of the database; the problem of constructing a global transaction order is especially complex in this case because any data item in the database belongs to exactly one partition and thus to exactly one node. FIG. 1 depicts the architecture of a share-nothing parallel database system in which the network 9 is made up of processing nodes 1, 2, 3, 4 each respectively associated with partitions 5, 6, 7, 8.
In order to better regulate the management of data, parallel or distributed database systems often operate under two protocols. The first is a "two-phase commit protocol," involving a transaction coordinator on some node and transaction participants on that node and others, in which each participant signals to the coordinator when the participant has completed its part of the transaction (first phase), and the coordinator signals to the participants whether the transaction should be executed or not (second phase). Specifically, the coordinator decides on which nodes lie the data items needed to complete the transaction and assigns a participant on each node to complete the operations necessary for each node. When each participant completes its part of the transaction, it sends a "vote" to the coordinator telling the coordinator that it has completed its work and requesting the coordinator to decide whether it should commit to the work it has done. After the coordinator receives a vote from all of the participants, it sends either a "commit" or an "abort" to each participant. The participants then commit to or abort their part of the transaction accordingly. For any transaction i, V.sub.i denotes its vote operation and C.sub.i denotes its commit operation. For these types of database systems, the local log files record the votes, commits, and aborts as well as the inserts, updates, and deletes. Thus, the two-phase commit protocol requires that in any node, the vote for a transaction will necessarily precede that transaction's commit or abort.
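The exchange described above can be sketched as a small single-process simulation. This is a minimal sketch under stated assumptions, not an implementation of any particular system: the class `Participant`, the function `run_two_phase_commit`, the tuple layout of the log records, and the boolean form of the vote are all hypothetical. The markers "V", "C", and "A" stand for the vote, commit, and abort entries.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    node: str
    log: list = field(default_factory=list)   # this node's local log file

    def do_work(self, txn, ops):
        # Record each operation in the order completed, then the vote (first phase).
        for op in ops:
            self.log.append((op, txn))
        self.log.append(("V", txn))
        return True   # assumption: a vote of True means "ready to commit"

    def finish(self, txn, decision):
        # Second phase: record the coordinator's decision, "C" or "A".
        self.log.append((decision, txn))


def run_two_phase_commit(txn, work_by_node, participants):
    """Coordinator: collect a vote from every participant, then broadcast the decision."""
    votes = [participants[n].do_work(txn, ops) for n, ops in work_by_node.items()]
    decision = "C" if all(votes) else "A"
    for n in work_by_node:
        participants[n].finish(txn, decision)
    return decision


participants = {"n1": Participant("n1"), "n2": Participant("n2")}
decision = run_two_phase_commit(
    1, {"n1": ["insert"], "n2": ["update", "delete"]}, participants
)
```

In every local log produced this way, the ("V", 1) entry precedes the ("C", 1) or ("A", 1) entry, which is the ordering property stated above.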
The second protocol under which these database systems operate is a "strict two-phase locking protocol" which controls and locks resources needed to complete database transactions. In general, a locking protocol requires the DBMS to lock resources needed to complete a transaction so that only that transaction can access the resources at any one time. A "two-phase locking protocol," consisting of a "locking phase" and a "releasing phase," restricts the acquisition and release of locks in such a way that all the locks must first be acquired for the transaction before any lock is released, and, once a lock is released, no other locks can be acquired for that transaction. A more stringent requirement results in the "strict two-phase locking protocol" where all the locks must first be acquired for the transaction and no locks are released until the commit or abort is processed for that transaction. In a share-nothing model of a parallel or distributed database system, each node independently abides by the strict two-phase locking protocol and thus requires that when two transactions i and j need to access the same resources on that node, the transaction that acquires the locks first will be committed or aborted before the other transaction. Thus, if transaction i acquires the locks first, in the log file for that node the commit or the abort of i will precede the commit or abort of j.
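The effect of strict two-phase locking on a single node's log can be sketched as follows. This is a simplified, single-threaded illustration; the class `Node` and its methods are hypothetical names, and a refused lock request simply returns False instead of blocking.

```python
class Node:
    def __init__(self):
        self.lock_owner = {}   # resource -> transaction currently holding its lock
        self.log = []          # this node's local log file

    def acquire(self, txn, resource):
        """Grant the lock unless another transaction already holds it."""
        owner = self.lock_owner.get(resource)
        if owner is not None and owner != txn:
            return False       # the requester must wait for the owner to finish
        self.lock_owner[resource] = txn
        return True

    def finish(self, txn, decision="C"):
        """Log the commit ("C") or abort ("A"), and only then release txn's locks."""
        self.log.append((decision, txn))
        for resource in [r for r, t in self.lock_owner.items() if t == txn]:
            del self.lock_owner[resource]


node = Node()
got_i = node.acquire("i", "x")        # i acquires the lock on x first
got_j_early = node.acquire("j", "x")  # j is refused while i still holds x
node.finish("i")                      # i commits, releasing x
got_j_late = node.acquire("j", "x")   # j can now proceed
node.finish("j")
```

Because j cannot obtain the lock on x until i's commit or abort has been logged, the entry for i necessarily precedes the entry for j in this node's log.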
In constructing a global transaction order, a set of transactions can be arranged into a "serialization order" which can either be a total order (a sequence relating any two transactions in the set) or a partial order (a sequence in which some transactions may be related to others but not necessarily). The term "serialization order" can also be used to describe the order in which two transactions i and j appear in the total or partial order. Related to "serialization order" is a "serialization requirement" which takes effect when a DBMS performs its operations so that it is only meaningful to view one transaction as preceding another. This requirement is related to the strict two-phase locking protocol described above in that if two transactions need to access a common data item, the database system will grant the lock to the data item first to one transaction and then to the other. In such a case there is a "serialization requirement" between the two transactions that "requires" the transaction that acquired the lock first to be ordered before the other transaction. Finally, each log file has a "serialization implication" which is made up of information regarding the serialization requirement carried by the system. This information includes the records of the sequence and type of operations performed on its associated node, together with the semantics of the system in which the operations are executed.
The combination of all the local log files in the system reflects the history of the entire database. In constructing a global order at some later time, only the local log files exist. The most obvious solutions to this delayed ordering problem are to examine either the timestamp information for each transaction or the data items accessed by each transaction, information that may be recorded in the local log files. However, in practice these methods do not work very well because global timestamps are not always available in every DBMS and examining data items accessed by each transaction involves too many log entries and a complex analysis.
In situations such as these when timestamp information or information relating to each data item is not available, one method used to construct a global order is to examine the order of commit entries in the local log files. However, the commit entries alone do not provide enough information for constructing a global transaction order. In a parallel or distributed database system, a database node records its commit entries in its log file in the order it processes them, not in the order imposed by the system serialization requirement. Thus, for two transactions i and j whose commit entries are recorded in the same local log file, it is possible that (1) i must be serialized before j; (2) j must be serialized before i; or (3) no serialization requirement exists between i and j. In addition, the commit entries of two transactions have several properties that may lead to an ambiguous ordering. First, the order of two commit entries in the log file does not necessarily imply their serialization requirement because the commit entry of one transaction may precede that of another in some log file simply because it is processed earlier than the other, and the two transactions may not access any common data item at all. Second, two different log files may record the commit entries of the same two transactions in different orders. Thus, examining the positions of commit entries in the log files alone does not provide enough information to construct a global serialization order.
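The second property can be made concrete with a short sketch. It assumes each local log is a list of (kind, transaction) records, with "C" marking a commit entry; the function names `commit_pairs` and `conflicting_pairs` are hypothetical. The sketch only detects pairs of transactions whose commit entries appear in opposite orders in two logs; it does not attempt to resolve the ambiguity.

```python
from itertools import combinations


def commit_pairs(log):
    """Return every (earlier, later) pair of transactions by commit position in one log."""
    commits = [txn for kind, txn in log if kind == "C"]
    return set(combinations(commits, 2))   # combinations preserves list order


def conflicting_pairs(log_a, log_b):
    """Pairs whose commit entries appear in one order in log_a and the reverse in log_b."""
    pairs_b = commit_pairs(log_b)
    return {(x, y) for (x, y) in commit_pairs(log_a) if (y, x) in pairs_b}


# Hypothetical local logs from two nodes that both took part in transactions i and j.
log_node_1 = [("C", "i"), ("C", "j")]
log_node_2 = [("C", "j"), ("C", "i")]
ambiguous = conflicting_pairs(log_node_1, log_node_2)
```

Here `ambiguous` contains the pair ("i", "j"): the two logs disagree on the commit order, so commit positions alone cannot say which transaction, if either, must be serialized first.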
This lack of information creates local log files that are often ambiguous, and it is difficult to infer a global order merely by examining the order of the individual commit operations in each local log file. Up to now there has been no method to construct a global order from local log files where the construction is performed some time after the log files have been created and can be completed on the full log file in one pass-through.