1. Field of the Invention
The invention relates generally to database systems and more particularly to techniques by which a database system can automatically maintain a history of the changes made in a table belonging to the database system.
2. Description of Related Technology
The technology that is related to the disclosed database system that provides for history-enabled tables falls into two broad categories:
Techniques for keeping track of insertions, updates, and deletions so that errors occurring during operation of the database system may be corrected; these techniques then form the basis of techniques for determining the past state of records and transactions in the database system; and
Techniques for dealing with time information in database tables.
The techniques relative to these categories are explained in the following.
Keeping Track of Insertions, Updates, and Deletions in Database Systems
Most database tables contain only currently-valid information; when a row in the database table is updated or deleted, the information contained in the row prior to its modification or deletion is lost. It soon became apparent to database users that keeping the information that was discarded in the update or deletion was worthwhile. To begin with, the reason for keeping the information was to restore the original information if the update or deletion had been erroneously made. Possible sources of errors included the humans who were entering the data or administering the database system, bugs in queries and programs being executed in the database system, and transactions which failed before they could be completed and therefore had to be rolled back. A transaction in the present context is a sequence of database operations which the database system treats as a single unit: if not all of the operations in the sequence are completed, the transaction is rolled back by undoing all of the operations that did complete. When all of those operations have been undone, the database has been restored, as far as the failed transaction is concerned, to the state it was in before that transaction began. If the conditions that caused the transaction to fail have been eliminated, the transaction can then be redone. The database system maintained a redo log in which it kept a record of every change made in the database system; the redo log thus contained the information needed to correct mistakes or redo transactions. The only limitation on the use of the redo log for correcting mistakes or redoing transactions was the amount of storage available in the database system for the redo log: the database system treated the redo log's storage as a circular buffer; when the buffer was full, the database system continued to write the redo log by overwriting the oldest entries in the log.
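The two mechanisms just described, all-or-nothing rollback and a change log kept in a circular buffer, can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the implementation of any particular database system: the class name TinyStore, the rule that a value of None makes an operation fail, and the choice to log only the changes of completed transactions are all hypothetical.

```python
from collections import deque


class TinyStore:
    """Toy key-value table with all-or-nothing transactions and a bounded log."""

    def __init__(self, log_capacity):
        self.rows = {}
        # A deque with maxlen silently discards its oldest entry when full,
        # which mimics a log written into a circular buffer.
        self.log = deque(maxlen=log_capacity)

    def run_transaction(self, operations):
        """Apply (key, value) pairs as one unit; undo everything on failure."""
        undo = []  # (key, existed_before, old_value), in the order applied
        try:
            for key, value in operations:
                if value is None:  # hypothetical failure condition
                    raise ValueError("operation failed")
                undo.append((key, key in self.rows, self.rows.get(key)))
                self.rows[key] = value
        except ValueError:
            # Roll back: undo the completed operations in reverse order.
            for key, existed, old_value in reversed(undo):
                if existed:
                    self.rows[key] = old_value
                else:
                    del self.rows[key]
            return False
        # Assumption of this sketch: only completed transactions are logged.
        for key, value in operations:
            self.log.append((key, value))
        return True


store = TinyStore(log_capacity=2)
first_ok = store.run_transaction([("a", 1), ("b", 2)])
second_ok = store.run_transaction([("a", 10), ("c", None)])  # fails, rolled back
third_ok = store.run_transaction([("c", 3)])  # pushes the oldest log entry out
```

After the three calls, the failed transaction has left no trace in the rows, and the two-entry log no longer holds the oldest change, which is the storage limitation noted above.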
Early database systems allowed only one user to access them at a time; modern database systems may be accessed by hundreds of users at once. One consequence of this is that transactions for a number of users may be accessing the same database record at the same time. If everyone who is accessing the database record is simply reading the database record, such concurrent access presents no problem, but if some are reading the record and others are modifying the record, inconsistencies may result. For example, in a read-only transaction by a first user, the record should not change during the transaction, i.e., a read at the beginning of the transaction and a read at the end should have the same results. However, if another transaction by a second user changes the record during the first transaction, the two reads will not have the same result. One way of keeping this from happening is to use the copy of the record to be read in the redo log for both the first and second read. Database systems manufactured by Oracle Corporation, of Redwood City, Calif., have long used this technique; recently, the SQL Server database system manufactured by Microsoft Corporation has begun employing a technique in which the version of a record that exists at the beginning of a transaction is maintained until the transaction is finished.
Eventually, designers at Oracle Corporation realized that the redo log was valuable not only to deal with errors and concurrency problems, but also as a source of historical information about the tables in the database system. Because the redo log had a record for every change made in the database system, it could be mined to find out what a table had looked like at a particular point in the past or to obtain a sequence of the changes made over time with regard to a single entity in the database system. In 2003, Oracle Corporation introduced a utility for reading the history of information in the database system from the redo log. This utility, termed Flashback, permitted users to query the redo log as if they were querying tables in the database system. The user specified a time in a query and Flashback reconstructed a snapshot of the tables in the query as they were at the specified time from the redo log and then performed the query on the reconstructed tables. The information from the query could be used to restore a table to a previous state or simply to see what the table looked like at the specified time. The user could also specify two times, and Flashback returned records as they had changed between the times. Of course, as with everything else that uses information in the redo log, Flashback can go no further back than the oldest available portion of the redo log. Another consequence of reconstructing the tables from the information in the redo log is that the further back into the redo log the database system has to go to reconstruct the table, the longer the reconstruction takes.
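The idea of answering a query as of a past time by reading a change log can be sketched as follows. This is only an illustration of the principle and makes no claim about how Flashback is implemented: the log entry layout (timestamp, operation, key, row) and the function names table_as_of and changes_between are assumptions, and the sketch further assumes that the log still holds every change since the table was empty.

```python
def table_as_of(change_log, as_of):
    """Rebuild the table state at time as_of by replaying the log in time order."""
    table = {}
    for timestamp, operation, key, row in sorted(change_log, key=lambda e: e[0]):
        if timestamp > as_of:
            break  # later changes are not part of the requested snapshot
        if operation == "delete":
            table.pop(key, None)
        else:  # "insert" or "update": the entry carries the new row contents
            table[key] = row
    return table


def changes_between(change_log, start, end):
    """Return the log entries whose timestamps fall within [start, end]."""
    ordered = sorted(change_log, key=lambda e: e[0])
    return [entry for entry in ordered if start <= entry[0] <= end]


# Hypothetical log with integer timestamps, used only for illustration.
log = [
    (1, "insert", "k1", {"qty": 5}),
    (2, "insert", "k2", {"qty": 7}),
    (3, "update", "k1", {"qty": 6}),
    (4, "delete", "k2", None),
]
snapshot_at_2 = table_as_of(log, 2)
snapshot_at_4 = table_as_of(log, 4)
recent_changes = changes_between(log, 3, 4)
```

The sketch shows both limitations mentioned above: entries that have been overwritten in the log cannot be replayed, and the work done grows with the number of log entries that must be read.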
Techniques for Dealing with Time in Database Tables
There are of course many situations in which a user will include time information in a database table. A systematic discussion of the ways in which this may be done and of the difficulties that SQL, the standard language used to write queries in relational database systems, has in expressing queries involving time information may be found in Richard T. Snodgrass, Developing Time-oriented Database Applications in SQL, Morgan-Kaufmann Publishers, San Francisco, USA, 2000. Useful terminology from the Snodgrass book includes the following:
There are three fundamental temporal datatypes:
Instant: something happened at an instant of time (e.g., now, Jul. 18, 2005, when this is being written, or sometime, perhaps much later, when it is being read)
Interval: a length of time (e.g., three months)
Period: an anchored duration of time (e.g., the fall semester, Aug. 24 through Dec. 18, 1998)
There are three fundamental kinds of time:
User-defined time: an uninterpreted time value
Valid time: when a fact was true in the reality being modeled in the table
Transaction time: when a fact was stored in the database
These kinds of time are orthogonal: a table can be associated with none, one, two, or even all three kinds of time. Snodgrass terms a table which is associated with valid time a valid-time state table; he terms a table which is associated with transaction time a transaction-time state table; he terms a table which is associated with both kinds of time a bitemporal table. Transaction-time state tables have the property that they can be reconstructed as of a previous date. Valid-time state tables and bitemporal tables permit queries involving specific points in time and periods of time. Such queries are termed temporal queries in the following. Examples are a query to determine what versions of the table's rows were in the table as of a given date and a query to determine what versions of the table's rows were in the table during a given period of time.
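The two example temporal queries can be sketched against a small valid-time state table using Python's sqlite3 module. The table name price, its columns, the sample rows, and the convention that a period is stored as an inclusive valid_from and an exclusive valid_to (with '9999-12-31' standing for "still valid") are all assumptions made for the illustration; they are not taken from the Snodgrass book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE price (
        item       TEXT,
        amount     INTEGER,
        valid_from TEXT,   -- inclusive start of the period, ISO 8601 date
        valid_to   TEXT    -- exclusive end; '9999-12-31' means still valid
    )
    """
)
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?, ?)",
    [
        ("widget", 10, "1998-01-01", "1998-09-01"),
        ("widget", 12, "1998-09-01", "9999-12-31"),
        ("gadget", 30, "1998-03-01", "1998-06-01"),
    ],
)


def versions_as_of(day):
    """Row versions whose period contains the given date."""
    return conn.execute(
        "SELECT item, amount FROM price "
        "WHERE valid_from <= ? AND ? < valid_to "
        "ORDER BY item, valid_from",
        (day, day),
    ).fetchall()


def versions_during(start, end):
    """Row versions whose period overlaps the period [start, end)."""
    return conn.execute(
        "SELECT item, amount FROM price "
        "WHERE valid_from < ? AND ? < valid_to "
        "ORDER BY item, valid_from",
        (end, start),
    ).fetchall()


as_of_april = versions_as_of("1998-04-15")
fall_semester = versions_during("1998-08-24", "1998-12-18")
```

ISO 8601 date strings compare correctly as text, which is why plain string comparison suffices in this sketch; both queries rely on each row carrying the full period during which it was valid.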
In Snodgrass' examples, the transaction-time state for a table is simply incorporated into the table; a paper by Tal Kelley, Using Triggers to track database action history from the year 2001, which was found in July 2005 at developerfusion.com/scripts/print.aspx?id=2413, describes a technique for associating a history database table with a primary table. A row is inserted in the history table whenever a row is inserted into the primary table or an existing row in the primary table is updated. The history table has columns that are equivalent to those in the primary table and has additional columns that indicate the time at which the row in the primary table was inserted or updated and the operation in the primary table that resulted in the row being inserted in the history table. When a row is inserted in the primary table, the row inserted into the history table includes the data from the primary table row, the time the row was inserted in the primary table, and an indication that the operation was “insert”. When a row is updated in the primary table, the row that is inserted in the history table has the data from the primary table row as it was before the update, the time the row was updated in the primary table, and an indication that the operation was “update”.
The rows are inserted into the history table by triggers, that is, user-written code that is automatically executed by the database system when certain events occur with regard to a table. Two of the events which may result in the execution of a trigger are the insertion of a row and the update of a row; thus, an insertion of a row in the primary table results in an execution of an insert trigger that creates the row corresponding to the insertion operation in the history table; similarly, the update of a row results in an execution of an update trigger that creates the row corresponding to the update operation.
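A minimal sketch of such a trigger-maintained history table follows, written for SQLite through Python's sqlite3 module. It follows the behavior described above but is not the code of the cited paper: the table names primary_t and history_t, the column names, and the use of SQLite's datetime('now') for the time value are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE primary_t (id INTEGER PRIMARY KEY, name TEXT);

    -- Same columns as the primary table, plus the time and the operation.
    CREATE TABLE history_t (id INTEGER, name TEXT, op_time TEXT, operation TEXT);

    -- Insert trigger: record the newly inserted data.
    CREATE TRIGGER primary_ins AFTER INSERT ON primary_t
    BEGIN
        INSERT INTO history_t
        VALUES (NEW.id, NEW.name, datetime('now'), 'insert');
    END;

    -- Update trigger: record the data as it was before the update.
    CREATE TRIGGER primary_upd AFTER UPDATE ON primary_t
    BEGIN
        INSERT INTO history_t
        VALUES (OLD.id, OLD.name, datetime('now'), 'update');
    END;
    """
)

conn.execute("INSERT INTO primary_t VALUES (1, 'alpha')")
conn.execute("UPDATE primary_t SET name = 'beta' WHERE id = 1")

history = conn.execute(
    "SELECT id, name, operation FROM history_t ORDER BY rowid"
).fetchall()
current = conn.execute("SELECT id, name FROM primary_t").fetchall()
```

Both history rows carry the value 'alpha': the first because it was the inserted data, the second because it was the data that the update replaced. The new value 'beta' appears only in the primary table.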
It should be pointed out here that the history table is an example of a transaction-time state table, albeit one that is rather hard to use. There are two reasons for this. First, Kelley's history table entry specifies only when the operation on the primary table that resulted in the creation of the history table row was performed. Second, the meaning of the time value depends on the operation on the primary table that caused the history table row to be created: in the case of an insertion, the time value indicates when the corresponding row in the primary table began existing; in the case of an update, the time value indicates when the corresponding row in the primary table ceased existing in the form specified in the history table row. Thus, using Kelley's history table to figure out the time period during which a given row of the history table existed in the primary table is a complex and expensive operation.
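The extra work can be made concrete with a short Python sketch that derives existence periods from history rows of the kind just described. The function name version_periods and the tuple layout (op_time, operation, data) are assumptions, and the sketch handles the history of a single primary-table row only.

```python
def version_periods(history_rows):
    """Derive (data, began, ceased) periods for one primary-table row.

    history_rows holds (op_time, operation, data) tuples in any order.
    An "insert" entry marks when its data began to exist; an "update"
    entry marks when its data (the pre-update version) ceased to exist.
    """
    ordered = sorted(history_rows, key=lambda r: r[0])
    periods = []
    previous_time = None  # time of the preceding operation on this row
    for op_time, operation, data in ordered:
        if operation == "update":
            # The start of this version is not stored in its own entry;
            # it must be taken from the preceding history entry.
            periods.append((data, previous_time, op_time))
        previous_time = op_time
    # The version now in the primary table began at the last operation.
    return periods, previous_time


# Hypothetical history with integer timestamps, deliberately out of order.
rows = [(30, "update", "v1"), (10, "insert", "v0"), (20, "update", "v0")]
periods, current_since = version_periods(rows)
```

Each period has to be assembled from two different history entries, and a database query doing the same thing would need a self-join or an ordered scan for every row of interest, which is the cost referred to above.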
As can be seen from the foregoing, currently-available techniques for keeping track of the history of a table in a relational database system have their drawbacks. Flashback is easy to use but requires a relational database system that keeps a redo log and is limited by the redo log: if the information for the table to be reconstructed is no longer in the redo log, Flashback cannot reconstruct the table; further, the time it takes to reconstruct the table is determined by how much of the redo log Flashback has to read to obtain the information necessary to reconstruct the table. It is of course possible for users of database systems to implement their own arrangements for keeping track of the history of tables of interest, but as the Snodgrass book demonstrates, more than ordinary expertise in SQL is required to properly construct and use arrangements for keeping track of the history of tables, and the Kelley reference serves as an illustration both of the required programming expertise and of some of the pitfalls involved in making one's own arrangements to keep track of the history of a table. What is needed is a technique for keeping track of the history of individual tables of interest which is as easy to use as Flashback but does not depend on the redo log and consequently is not limited by the amount of redo log available and does not require reading the redo log. It is an object of the invention disclosed herein to provide such a technique.