A transaction-time database enables users to inquire about the state of data contained in the database over a period of time; each data element (or record) in such a database is associated with a timestamp. To this end, it is important to maintain the serialization order of transactions contained in such a database, so that users can make database queries using user-sensible time values. For example, if a transaction A is serialized earlier than a transaction B, then the timestamp of transaction A and of the records associated with transaction A must be less than the timestamp of transaction B and of the records associated with transaction B.
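As an illustration only, the following minimal sketch shows why timestamps must agree with serialization order: an "as of" query simply returns the latest version whose timestamp does not exceed the requested time. The function name `as_of`, the tuple layout, and the integer timestamps are assumptions made for this example and are not drawn from any particular system.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Each version is (timestamp, value); the list is kept sorted by timestamp.
# Integer timestamps and this layout are assumptions for illustration.
Version = Tuple[int, str]


def as_of(versions: List[Version], t: int) -> Optional[str]:
    """Return the value that was current at time t, or None if none existed."""
    stamps = [ts for ts, _ in versions]
    i = bisect_right(stamps, t)  # number of versions with timestamp <= t
    return versions[i - 1][1] if i else None


# Transaction A serialized before transaction B, so A's timestamp is smaller.
history = [(10, "written by A"), (20, "written by B")]
```

If transaction B had received a smaller timestamp than transaction A despite serializing later, a query at a time between the two would return the wrong version.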
One problem with timestamping records related to a transaction of a transaction-time database is that the transaction's timestamp is usually unknown when the transaction is updating a record. Therefore, in order to ensure that all records updated by a transaction have been timestamped consistently with the serialization order of the transaction, conventional techniques select timestamps for records lazily, i.e., as late as possible in the execution of the transaction. Ideally, conventional techniques lazily select a timestamp when a transaction commits (i.e., when the transaction has completed and is persistent (e.g., saved to disk)). In this case, such techniques select the time of commit as the timestamp associated with the committed transaction.
In order to ensure that all records updated by a transaction have been timestamped after the transaction commits, timestamp information for the transaction must be made persistent until all records of the transaction have been timestamped. To this end, conventional techniques utilize a persistent transaction identifier (ID) timestamp table, which maps an ID of a transaction to a timestamp of the transaction. First, a transaction ID is assigned to a transaction when the transaction begins. Second, the transaction ID is stored in a field of a record the transaction is updating—this field will eventually be used to store a timestamp.
Third, the transaction ID is stored in a transaction ID timestamp table, usually when the transaction commits, at which point its timestamp is known. Once the transaction commits, the transaction ID is mapped to the timestamp utilizing the transaction ID timestamp table. Fourth, in a subsequent transaction that accesses the record previously modified and lacking timestamp information, the record's field, previously containing a transaction ID, is replaced with the timestamp retrieved from the transaction ID timestamp table (where the transaction ID was mapped to the timestamp). Finally, once all records of the transaction have been updated with the timestamp of the transaction, information about the transaction can be deleted (garbage collected) from the transaction ID timestamp table.
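The steps above can be sketched roughly as follows. This is a simplified, in-memory illustration and not a definitive implementation: the names `LazyTimestamper`, `Tid`, and `Record` are hypothetical, the counter standing in for a commit clock is an assumption, and the per-transaction count of unstamped records is one assumed way of deciding when garbage collection is safe. Persistence of the table and of the stamped records is not modeled.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Union


@dataclass(frozen=True)
class Tid:
    """Transaction ID held in a record's timestamp field until stamping."""
    value: int


@dataclass
class Record:
    key: str
    data: str
    stamp: Union[Tid, int]  # a Tid until lazily replaced by a timestamp


class LazyTimestamper:
    """Hypothetical in-memory model of a transaction ID timestamp table."""

    def __init__(self) -> None:
        self._next_tid = count(1)
        self._clock = count(100)  # assumed stand-in for a commit clock
        self.tid_table: Dict[int, int] = {}   # transaction ID -> timestamp
        self._unstamped: Dict[int, int] = {}  # transaction ID -> records left

    def begin(self) -> Tid:
        # First: assign a transaction ID when the transaction begins.
        return Tid(next(self._next_tid))

    def write(self, tid: Tid, key: str, data: str) -> Record:
        # Second: store the transaction ID where the timestamp will later go.
        self._unstamped[tid.value] = self._unstamped.get(tid.value, 0) + 1
        return Record(key, data, tid)

    def commit(self, tid: Tid) -> int:
        # Third: at commit the timestamp is known and is mapped from the ID.
        ts = next(self._clock)
        self.tid_table[tid.value] = ts
        return ts

    def read(self, record: Record) -> Record:
        # Fourth: a later access replaces the ID with the mapped timestamp.
        stamp = record.stamp
        if isinstance(stamp, Tid) and stamp.value in self.tid_table:
            record.stamp = self.tid_table[stamp.value]
            self._unstamped[stamp.value] -= 1
            if self._unstamped[stamp.value] == 0:
                # Finally: every record is stamped, so drop the table entry.
                del self.tid_table[stamp.value]
                del self._unstamped[stamp.value]
        return record


mgr = LazyTimestamper()
t1 = mgr.begin()
a = mgr.write(t1, "a", "v1")
b = mgr.write(t1, "b", "v1")
commit_ts = mgr.commit(t1)
mgr.read(a)  # a now carries commit_ts; the table entry is still needed for b
mgr.read(b)  # b is stamped too, so the entry for t1 is garbage collected
```

In this sketch a record that is read before its transaction commits simply keeps its transaction ID, since no mapping exists yet.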
One concern with conventional techniques that lazily select timestamps is that updating a persistent transaction ID timestamp table during every transaction can account for a large percentage of a transaction's overhead, especially for short transactions with only one or a few updates. Another concern is the cost of maintaining a persistent transaction ID timestamp table: adding elements to, and removing elements from, such a table consumes valuable processing resources. Yet another concern is efficiently guaranteeing that all records of a transaction have been stored (e.g., written to disk) before information related to the transaction is deleted from a persistent transaction ID timestamp table. Low-cost lazy timestamping methods do not treat timestamping as a recoverable update activity, so the durability (persistence) of timestamped records in the event of a system crash must be guaranteed in some way that does not require a recovery log.
It is therefore desirable to have timestamping systems and methods that (1) decrease the cost of updating the transaction ID timestamp table, especially during short transactions; (2) decrease the cost of maintaining a persistent transaction ID timestamp table; and (3) efficiently and promptly guarantee that all timestamped records of a transaction have been stored durably before information related to the transaction is deleted from a persistent transaction ID timestamp table.