The present invention concerns temporal databases and, more particularly, relates to time stamping of data items associated with a transaction.
A conventional database relation contains sets of records. Insertions, updates and deletions occur over time. It is frequently important to retain a perfect record of past database states. In many applications, such as financial, banking, insurance and medical applications it is desirable to maintain a database that can access not only the current set of records, but also a set of records that was current at a specified time. For example, the practice in financial applications is for accountants to correct errors, not by using an eraser, but by posting compensating transactions to the books. A bank must be able to determine the balance of a customer""s account as of a certain day. In medical applications, the basis on which decisions were made are documented to guard against malpractice lawsuits.
A database having transaction time support allows not only the current state of the database to be accessed, but also previous states of the database to be accessed. A transaction time supported database provides accountability or traceability of the database records, which is important in databases used in financial, insurance, medical and other applications.
A transaction time relation consists of a set of data items, which may be thought of as a set of records. A start time d.TTand an end time d.TTare maintained by the system for each data item d. The start time d.TTand the end time d.TTdefine the time interval (d.TT, d.TT) during which each data item d was part of the current database state. This is a semi-open interval that includes d.TTbut excludes d.TT, ensuring that a data item never has more than one value at any time. The start time d.TTrecords the time when the data item d became part of the current state of the database and the end time d.TTrecords when the data item d ceased to be part of the current state of the database.
A transaction that inserts the data item d sets the start time d, also referred to a write timestamp, TTto the current time tcurrent. The insertion sets the end time d.TTto a variable now that is continuously updated to the current time. The data item having a start time d.TTand an end time d.TTset to the variable now is a current data item d within the database. The data item d remains a current data item until the data item d is explicitly deleted by a delete or update transaction. A transaction that deletes the data item d sets the end time d.TTto the current time tcurrent, indicating that the data item d ceased to be current at the transaction time tcurrent. Update transactions are typically implemented as deletions of the original data item d to be updated followed by an insertion of the updated data item d.
A common query in a database having transaction time support is termed a time slice. A time slice asks for the set of data items that were current at some past time t. The time slice query is answered by finding each data item d having a start time d.TTon or before the past time t requested and an end time d.TTthat was on or after the time t requested. The time slice at time t returns the state of the database that was current at the past time t.
When user specified transactions are supported, it is necessary to use the same timestamp value for all statements in the same transaction. It would be possible for a time slice to return an inconsistent database state if the same timestamp is not used for all statements in a single transaction. Using the same timestamp for all statements of a transaction makes all actions of a transaction conceptually take place at the same time. However, the SQL standard currently allows different statements in the same transaction to use separate timestamp values.
The specific choice of the time that is used for timestamping the data items of a transaction is essential to ensuring that any previous database state that can be retrieved via a time slice at a past time t is indeed the database state that was current at the past time t. If the transaction timestamp order does not agree with the serialization order of the transactions, it is possible that the time slice will return a database state that never existed as a current database state.
In the SQL database language, a query or modification can reference the current time, tcurrent. The current time can be stored as an attribute in the database or used to query the database. For example, the current time tcurrent can be used to retrieve the state of the database that was current ten minutes ago. Referencing the current time tcurrent in a query can force the database management system to choose this time before a transaction commits. Choosing the current time before the transaction commits exposes the transaction to the risk that it will subsequently be discovered that the transaction timestamp given to the transaction and the timestamp values given to other transactions are not ordered in a way that is consistent with a valid transaction serialization order. Table 1 is a schedule of two transactions T1 and T2 that choose the current time as the transaction timestamp early in the transaction.
Referring to Table 1, at Time 1 the timestamp for transaction T1 timestamp is fixed at 1. At Time 2, transaction T1 writes data item x and sets the start time for the data item x to 1 (the transaction""s timestamp) and sets the end time for data item x to the continuously updating variable now, since the data item x is current. At Time 3, the transaction T2 timestamp is fixed at 3. At Time 4, transaction T2 writes the data item y and sets the start time for data item y to 3 (the transaction""s timestamp) and sets the end time for y to the variable now, since the data item y is current. At Time 5, transaction T2 commits. At Time 7, transaction T1 reads the data item y. At Time 8, transaction T1, writes data item z and sets the start time for data item z to 1, transaction T1""s timestamp. The end time for data item d3 is set to the variable now, since the data item d3 is current.
Table 1 illustrates that a time slice may not produce an accurate representation of the database at a given time t if the transaction timestamp order does not agree with serialization order. The transaction timestamp order shown in Table 1 is transaction T1, which fixes its timestamp at time 1, followed by transaction 2, which fixes its timestamp at Time 3. However, there is a read-write conflict between transaction T1 and transaction T2. Transaction T1 reads data item y that is written by transaction T2. This read-write conflict puts transaction T2 before transaction T1 in any serializable schedule involving these two transactions, since data item y is written by transaction T2 before data item y is read by transaction T1. As a result, the transaction timestamp order does not agree with the serialization order of the two transactions.
For simplicity of exposition, we elide the writes done to delete old versions when updates occurs. A read-write conflict on data item y between transactions T1 and T2puts T2before T1 in any serializable schedule involving these two transactions. In addition, T1 chose its tcurren value at Time 1, in preparation for the write statement at Time 2, while T2 chose its tcurren value at Time 3. Note also that this schedule is allowed by two-phase locking. This results in serialization order being different from timestamp order, and causes two potential problems. First, a timeslice for  less than y,z greater than  for Time 2, issued at Time 6, returns  less than y=y0,z=z0 greater than  since T2 has a time later than Time 2 and T1 has not yet committed or even accessed z yet. However, the same timeslice (i.e., for Time 2), instead issued at Time 10, returns  less than y=y0,z=14 greater than  because it sees the z written by T1. Second, clearly one of these timeslice results (the later one in this case) is not a transaction consistent view of the database. This is unacceptable.
The anomaly illustrated by Table 1 can be avoided if a transaction X""s timestamp can be established at the time at which the transaction X is committing. At the time the transaction X is committing, the timestamp can be chosen to agree with the commit ordering of the transactions. However, a transaction may be forced to choose a timestamp at an earlier point in the transaction, for example, when the transaction asks for the current time or date. If a transaction asks for the current time or date, the transaction must have a timestamp that corresponds with the current time or date returned by the database management system. When the transaction is forced to choose a timestamp before the transaction is committing, keeping the timestamp order consistent with a valid transaction serialization order becomes a substantial problem. Prior proposals for transaction timestamps have either not addressed early timestamp choice, have not avoided excessively restricting potential concurrency or have resulted in excessive transaction aborts.
One proposed prior timestamping approach is referred to as commit time choice of timestamp. If the transaction is able to delay the choice of the transaction""s timestamp until transaction commit, it is possible to simply choose a timestamp that reflects the order in which the transactions have committed. The transaction timestamp reflects the transactions time of commit. Commit time choice of timestamp requires that something other than timestamp ordering be used for concurrency control. One concurrency control technique that has been proposed is known as two-phase locking. FIG. 2 is a flow chart that generally shows how locking works. For a data item being read, the transaction will determine if there is an active transaction write. The transaction will know whether or not there is an active transaction write, because a write or exclusive mode lock is placed on each data item being written. If there is an active transaction write, the transaction will block or prevent the data item from being read. If there is not an active transaction write on the data item, the transaction will place a read or share mode lock on the data item being read. The read or share mode lock on the data item being read allows the data item to be read by other transactions but does not allow the data item to be written.
When data items are written by a transaction, the transaction determines whether there is an active transaction read or write on the data item being written. The transaction can determine whether there is an active transaction read by determining whether a transaction has placed a read lock on the data item and can determine whether there is an active transaction write by determining whether a transaction has placed a write lock on the data item. If there is an active transaction read or write of the data item, the transaction blocks the write to the data item until there is no longer an active transaction read or write. If there is not an active transaction read or write, the transaction places a write lock or exclusive mode lock on the data item. The write lock or exclusive mode lock prevents the data item being written by the transaction from being read or written by other transactions. Two phase locking means that a transaction acquires all of its locks prior to releasing any of the locks. If all transactions follow this discipline, the result will be a serial schedule of transactions. Most systems follow a more stringent locking policy called strict two phase locking. This policy provides that no lock is released until a transaction is ready to commit. Strict two phase locking guarantees that a transaction can be rolled back should it abort, without triggering the rollback (abort) of any other transaction. In what follows, we assume that strict two phase locking is in use.
Strict two-phase locking has been considered for use in conjunction with commit time choice of timestamp to assure that timestamps agree with the transaction serialization order. FIG. 3 is a flow chart showing the steps involved in commit time choice of timestamp when a request for current time has not been requested by the database management system. FIG. 4 is a flow chart that attempts to use the same general approach involving variable LAST when the current time has been requested. Referring to FIG. 3, variable LAST is maintained by the database and represents the time the last transaction was committed. For each subsequent transaction that attempts to commit the timestamp for the transaction is set to a value greater than the variable LAST. The variable LAST is then updated to the new greatest timestamp.
One problem with commit time choice of timestamp is that it does not permit the database to be responsive to a request by a statement in the transaction for the current time. When a request for the current time occurs, the database cannot simply choose a timestamp at commit. Rather, the timestamp must be selected when the request for current time occurs, so the timestamp assigned to the transaction corresponds to the requested current time. Referring to FIG. 4, commit time choice of timestamp may set the transaction timestamp to LAST when the current time is requested. As each subsequent transaction with a timestamp greater than LAST commits, the variable LAST is set to the new larger timestamp value. Transactions having a timestamp less than the variable LAST are aborted, enforcing the rule that timestamp order agree with serialization order of the transactions. This rule can cause a large number of transactions to be aborted. For example, any transaction having a transaction timestamp set to LAST as a result of a request for current time will be aborted if another transaction commits before the transaction that requested the current time, since the variable LAST will update to the time the last transaction committed.
Another proposed method of assigning timestamps to transactions is known as timestamp order concurrency control. FIG. 5 is a flow chart showing timestamping of transactions in accordance with timestamp order concurrency control. Timestamp order concurrency control forces transactions to commit in the order of the timestamps assigned to the transactions. When the commit of a transaction will violate the order of the transaction timestamps, the transaction is aborted.
Timestamp order concurrency control has existed for about 20 years. Referring to FIG. 5, timestamp order concurrency control chooses the timestamp tX for a transaction at the start of the transaction or at the time the transaction first accesses data. The transaction timestamp is assigned to data items when they are read and written by the transaction. Transactions can access data such that write-write conflicts, write-read conflicts and read-write conflicts are created. Timestamp order concurrency control associates a write timestamp d.TT(referred to as the start time above) and a read timestamp d.TR with each data item d in the database. The write transaction d.TTrefers to the time the data item was written. The read timestamp d.TR refers to the time the data item was last read. Timestamp order concurrency control attempts to minimize aborts resulting from transaction ordering conflicts by maintaining precise timestamp constraints for each transaction and each data item.
In transaction-time databases, the start time or write timestamp d.TTis stored persistently with the data to maintain all previously current states of the database and thus support time slice queries. The stored start values d.TTof the data allow the database management system to handle write-write and write-read conflicts. To deal with read-write conflicts, the time each data item was last read d.TR must also be maintained. Whenever a data item d is read by a transaction X with a timestamp tX that is larger than the data item""s current read time d.TR, the data item""s read time d.TR is set to the transaction timestamp tX. The read time d.TR does not need to be a persistent part of the database, since queries typically do not ask when the data was last read. As such, the timestamp order concurrency control methods can exploit a volatile data structure that captures the read time d.TR for each data item d and periodically clears the read times for data items that have not been written within a fixed time interval.
Referring to FIG. 5, after the transaction is given a timestamp tX at the start of the transaction, a transaction writing data items compares the transaction timestamp tX to the read timestamps d.TR of the data items and compares the transaction timestamp tX to the write timestamps d.TTof the data items. The data item""s read timestamps d.TR are maintained in a volatile structure, while the write timestamps d.TTare maintained in the stable transaction-time database. For each data item being read by the transaction, the transaction timestamp tX is compared to the write timestamp d.TTof the data item being read. If the transaction timestamp tX is less than the write timestamp d.TTthe transaction aborts. A later transaction wrote data item d, so the transaction came too late. If the transaction timestamp tX is greater than the time the data item was written d.TTthe data item d is read by the transaction. If the transaction timestamp tX is greater than the data item read timestamp d.TR, then the transaction sets the read timestamp d.TR for the data item to the transaction timestamp tX.
For data items being written by the transaction, the transaction compares the transaction timestamp tX to both the time the data item was written d.TT(write timestamp) and the time data item was read d.TR. (read timestamp) If the transaction timestamp tX is less than the time the data item was written d.TTor the time the data item was last read d.TR the transaction aborts, because a later transaction read or wrote the data item d. If the transaction timestamp tX is greater than both the time the data item was written d.TTand the time the data item was last read d.T.R, the data item is written by the transaction and the write timestamp of the data item d.TTis set to the transaction timestamp tX.
Timestamp order concurrency control presents at least two problems. The first problem is that each transaction needs to have a timestamp tX when the transaction starts, so that the transaction can use the timestamp tX in the timestamp consistency testing. By forcing the choice of the transaction timestamp at the transaction start, the probability that the timestamp consistency check will fail, resulting in an abort, significantly increases.
A second problem with timestamp order concurrency control is that a potentially enormous number of variable data read timestamps d.TR need to be stored in the database. The access structure for the data read timestamps d.TR needs to provide efficient access for all current data items and needs to be a dynamic structure that can grow in size as more data items are included in it. At some point, the growth of the structure for storing data read timestamps d.TR must be limited so that it can be reasonably maintained in volatile memory.
Timestamp order concurrency control methods have proposed what is known as garbage collection to prune the structure that stores the read timestamps d.TR. One garbage collection technique deletes read timestamps that are older than some arbitrary time xcex4. That is, any read timestamp d.TR that is less than the given time xcex4 subtracted from the current time Tcurrent is deleted. All data items without an explicit read timestamp are implicitly given a read timestamp equal to the result of subtracting the given xcex4 from the current time. If a transaction with a timestamp that is less than Tcurrentxe2x80x94xcex4 writes a data item that does not have a stored read timestamp, the transaction is aborted. The garbage collection method may thus implicitly set the read timestamps d.TR of data items to values that are significantly larger than the previously assigned read timestamps, which may result in additional transaction aborts.
The disclosed system and method concern transaction timestamping in temporal databases. The system and method assigns transaction timestamps that are consistent with the transaction serialization order and minimizes transaction aborts. One aspect of the disclosed system and method assigns acceptable upper and lower bounds for the transaction timestamp before the transaction timestamp is assigned. A transaction timestamp lower limit is set when a data item is read to be the larger of the time the data item was written and the previous timestamp lower limit. When a data item is written, the transaction timestamp lower limit is set to the largest of a time the data item was previously written, a time the data item was last read and the previous timestamp lower limit. A transaction timestamp upper limit is set to an end of a time interval that is requested by the transaction based on the precision of the current time requested by the transaction. The transaction timestamp is assigned a value between the transaction timestamp lower limit and the transaction timestamp upper limit.
The disclosed system and method assigns an upper timestamp limit when it is determined that a current time has been requested by the transaction that has precision that is less than the maximum precision of the transaction timestamp. The upper timestamp limit is the maximum time that can be produced by extending the precision of the current time request. The transaction timestamp lower limit is set as described above. When it is determined that a time interval has not been requested by the transaction, a transaction timestamp is assigned a value that is greater than the timestamp lower limit. When it is determined that a time interval has been requested by the transaction, a transaction timestamp upper limit is set to an end of the time interval requested by the transaction. The transaction timestamp is assigned a value between the transaction timestamp lower limit and the transaction timestamp upper limit.
The disclosed system and method varies the manner in which timestamps are assigned based on a request for current time by a transaction. The disclosed system and method sets an initial transaction timestamp lower limit. During a first phase of transaction execution, the lower limit is updated based on reading and writing of data. If a data item is read, the timestamp lower limit is updated to a time that is at least as large as the time the data was written. If a data item is being written, the timestamp lower limit is updated to a time at least as large as the time the data was previously read or written. A second phase of transaction execution occurs after a request for current time is processed. In the second phase, the upper limit is evaluated based on reading and writing of the data. If data is read, and the upper limit is less than the time the data was last written, the transaction is aborted. If the data is written by the transaction and the upper limit is less than the time the data was previously read or written, the transaction is aborted. The transaction is committed by finalizing the reading and writing of the data to the database unless the transaction aborted during the second phase of transaction execution.
The disclosed system and method assigns read timestamps to sets of data items. The read timestamps assigned to the sets of data items are used to set the transaction timestamp lower limit. When a data item is read, a transaction lower limit is set to the larger of the time the data item was written and a previous transaction timestamp lower limit. When a data item is written, the transaction timestamp lower limit is set to the largest of the time the data item was previously written, a read timestamp assigned to a set of data items that corresponds to the data item being written and the previous timestamp lower limit. The transaction timestamp is set to a value that is greater than the timestamp lower limit.
A table may be used to assign the read timestamps to the corresponding sets of data items. Read times are stored in entries of a table that correspond to sets of data items. A transaction timestamp is assigned to the transaction. A list of all data items read by the transaction is stored. At transaction commit, entries in the table that correspond to each data item read by the transaction are accessed. The read time of each accessed entry of the table is updated to the transaction timestamp when the transaction timestamp is greater than the read time stored in the entry of the table.
In one embodiment, read timestamps are stored in storage locations of a hash table. A hash function determines the storage location of the hash table that is accessed by a given data item. Read timestamps for multiple data items may share a single memory location in the hash table. The memory location is set to the highest read timestamp for the data items that share the memory location. Read timestamps are assigned to sets of data items in this manner. For data items that are written by a transaction, the hash table is accessed to set the transaction timestamp lower limit. For data items read by a transaction, the read timestamps stored in the hash table are updated to the greater of the transaction timestamp and the existing timestamp stored in the storage location of the hash table.