Conventional database relations contain sets of records. Insertions, updates and deletions of these records occur over time. In many applications, such as financial, banking, insurance, and medical applications it is desirable to maintain relations that contain not only the current set of records, but also a set of records that was current at any given past time. These relations are maintained in a form of a temporal database called a transaction-time database. Such a database supports queries of the form “give an answer AS OF some time in the past.” In order to answer queries this type, the database needs to keep track of when changes occur in the database. This can be done by keeping track of the time at which transactions update individual records by timestamping the records. Each record version contains not only the record identification, but the timestamp of the transaction that caused the update that produced the version in question. “AS OF” queries for a time t are satisfied by searching for the versions of records with timestamps that are the largest timestamp for the record that is less than or equal to time t.
One existing approach to choosing timestamps is to delay timestamp choice until commit time. Using this approach, the time of commit can be used as the timestamp for the committing transaction. Since the commit order is consistent with transaction serialization order, the timestamp order will likewise be consistent with a serialization. One drawback of commit time timestamping is that commit time timestamping does not support transactions that request CURRENT TIME. This is because requests for CURRENT TIME can require that a timestamp be assigned that is earlier than commit time.
One solution that supports CURRENT TIME requests is disclosed in U.S. patent application Ser. No. 09/939,180 (hereinafter the '180 application), filed on Aug. 24, 2001, invented by David Lomet, assigned to the assignee of the present invention, and is incorporated herein by reference in its entirety. In the method disclosed by the '180 application, timestamps of data accessed during transaction execution are used to determine a lower bound on when a transaction could commit. An upper bound was derived from the CURRENT TIME request. So long as the interval between the lower bound and the upper bound is non-vacuous, the transaction could be committed and assigned a timestamp in the interval.
The solution disclosed by the '180 application is suited for the case where all data exists within a transaction time database. The time of last write for a record is stored as a timestamp in the record itself. This timestamp serves to identify the version of a record to be returned in a query that asks for data as of some particular time. The timestamp in the record enables the system to ensure that write-write and write-read conflicts were correctly reflected in the order of the timestamps, by requiring that the current transaction have a timestamp greater than the record's timestamp.
The '180 application also proposes keeping track of the last time that a record is read so that timestamps can be ordered consistent with read-write conflicts. The '180 application proposed a “read-timestamp table” (RTT) for this purpose. The '180 application suggested that the RTT did not have to retain information about every record, but could identify record classes by means of a hash function, used to index the table.
Existing solutions do not deal with situations where not all data resides in a transaction-time database. A solution is needed for timestamping that can work when ordinary data exists, and the ordinary data may or may not be accessed with the transaction time database data in the same transaction. Prior timestamping techniques also do not enforce timestamp order when locking detects phantoms that would result were a record to be written into ranges accessed by range queries. Prior timestamping techniques assign timestamps to individual records. However, assigning timestamps to individual records does not prevent a subsequent record from being written with a timestamp that is inconsistent with a previous range query.
There is a need for a timestamping technique that accommodates databases that include both temporal and non-temporal data. There is also a need for a timestamping technique that correctly orders timestamps when insertions conflict with range queries. There is also a need for such a timestamping technique that minimizes the overhead while correctly providing timestamps to transactions.