An important function of a database management system (DBMS) is to reconstruct a past sequence of events that occurred in a database either for forensic analysis or to satisfy regulatory requirements. Constructing such events may involve a (DBMS) responding to a temporal query. A temporal query is a query where results may only be computed from one (or more) previous database states. Consider a situation in which a DBMS tracks purchase orders and user roles. A user's role determines the user's authority to perform functions or actions in the database system, such as purchasing orders. At some point a salesman is incorrectly granted authority to approve purchase orders. For example, the salesman role that is stored in a table in the database is incorrectly changed to grant purchase order authority. The salesman then proceeds to approve purchase orders, wherein the DBMS records the purchase orders. At some point, later in time an auditor discovers that the salesman should not have been granted this authority. The auditor may cause the salesman's role to be changed to no longer allow purchase order approval, if this role change has not already been made. The auditor also needs to know all of the purchase orders approved by the salesman with incorrect authority. The information may not be available in the database, in its current state, because the records for the purchase order have been deleted or authorization information in the records has been overwritten to reflect purchase order approval by someone with legitimate authority. To determine what purchase orders were approved by the salesman, the auditor may wish to user a temporal query such as the following.
“Show me the all the purchase orders approved by John Doe during the time that he was a salesman.”
The actual format for the temporal query may be different than the example. The start time for the example temporal query is when the salesman's role became “Salesman.” The end time for the temporal query is when the salesman's role does not include “Salesman.” Thus, the DBMS should return all purchase orders approved by the salesman between the start and end times.
There are other situations where there is a need for a DBMS to respond to temporal queries. For example, an operator may have incorrectly marked an item as “for sale.” In this case, it is desirable to determine all sales of the item recorded by the DBMS that were made with the incorrect price.
Determining an answer for such temporal queries presents numerous challenges. Techniques such as consistent read, flashback query etc. are not viable solutions to the problem. A consistent read or flashback query executes a query as of a specific time in the past. However, the beginning and/or ending time that is pertinent for the temporal query may not be known. For example, using the example of the salesman, the hire date of the particular salesman and the time at which the problem was discovered may be known. However, the time at which the authority was incorrectly granted may not be known. Moreover, the time at which the authority was corrected may not be known.
Further, the response to the temporal query should provide all the results of interest between a start time and an end time. For example, all of the purchase orders that the salesman incorrectly granted are of interest. Therefore, techniques such as consistent read, flashback query etc. are not viable solutions to the problem.
Moreover, a point-in-time recovery of a database to a time in the past is not a viable solution. As previously discussed, the pertinent time may not be known. Also, a point-in-time recovery does not provide information over a range of times.
One conventional solution to the temporal query problem is for an application programmer to maintain a detailed trail of time-stamped changes within the application. For example, the programmer internally converts every update or delete statement to include a pre-processing or post-processing step which logs the before images of the data and a timestamp of when the change was made. This causes additional columns and rows to be created in the database that can be used to answer the temporal query.
Using the above example temporal query, there may be a roles table (“roles”) that has an attribute “role name”, which may have a value of “Salesman” indicating the individual with that role has purchase order authority. Moreover, an orders table (“orders”) may have an attribute “approver_name” whose value indicates the individual who approved the order. In order to answer the temporal query, the application programmer can create two new columns, “when” and “when_deleted” in both tables. The programmer includes the current time in the “when” column as part of an update or an insert statement. Further, the programmer includes the current time in the “when_deleted” column when a row is deleted. (Alternatively, the programmer could change deletion of rows to insertion of a new row with a status column set to “DELETED.”) These changes to the database allow the above example temporal query to be answered by issuing the following SQL statement:
select *
from orders o, roles r
where o.approver_name=“John Doe” and r.name=“John Doe” and                r.role_name=“Salesman” and o.when between r.when and r.when_deleted        
However, such conventional techniques have a number of drawbacks. An extraordinary burden is placed on the application programmer to extend the database schema and correctly maintain a trail of changes to database rows. Performance suffers because all changes are audited by the application. Further, the size of the database tables increases. One reason for the increase in size is that timestamp information cannot be deleted from the table, which means that rows cannot be deleted.
Rather than making the above changes to the application, another conventional approach is to create database triggers that execute code in the database backend to achieve a similar result. Richard T. Snodgrass describes such a method and other related issues in, “Developing Time Oriented Database Applications in SQL.” However, creating database triggers suffers from similar problems as techniques that place a burden on the application programmer.
Many database applications enforce a unique constraint to a table. For example, the table is constrained such that a particular user can only have a single privilege. However, this prevents the database from having multiple rows for a particular user in the same table. Thus, the same table cannot be used to assign two different roles to John Doe, in the example.
A solution to the limitation of the table allowing only one role per user is to create a separate history table. This penalty can be mitigated by migrating “historical” rows to a different partition. However, the schema designer is still faced with the burden of partitioning the data.
Another solution to the limitation of the table allowing only one role per user is to change the constraints to add a timestamp, which is a difficult process. However, applying this constraint involves applying numerous rules, which are enforced by substantial modifications to SQL statements. The modifications to the SQL adds substantial overhead and complexity to the application or the DBMS, if such functionality is built-in to the DBMS.
The penalties associated with the foregoing techniques are incurred even though the temporal queries are issued infrequently. For example, the temporal queries may only occur as part of an investigation during an audit. Further, the query may be issued against a small subset of the database objects, whereas the penalty is incurred on changes to all objects.
Thus, there is a need for a DBMS to respond to temporal queries. There is a need for techniques that do not place a heavy burden on application programmers or that require substantial changes to the database backend. There is a further need for a technique that does not cause a severe performance penalty to normal operation of the database. There is a still further need for a technique that does not require substantial increase to the database size.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which: