Databases are used to store a variety of information. For example, databases may track financial transactions, errors occurring in a computer system, and changes to a document file. These databases are often essential to the operation of a company. Thus, the databases must store the information, such that the information is quickly accessible on demand. Further, the databases must be resilient to errors, such that the databases can be quickly recovered should an error occur. One conventional technique for decreasing the down time of a database during recovery is to track changes to the database through an audit trail. When entries are changed in the database or new transactions entered to the database, the information may be stored as records in the audit trail.
Step IDs uniquely identify a single transaction or a series of transactions operating on the database. Step IDs may be used to track the progress of the transactions through database processing and maintain atomicity of the transactions. FIG. 1 illustrates a conventional step ID. Conventionally, a step ID 100 is a value comprising a combination of a first portion 102a that is time based, which may include an uniqueness sequence 102b within the time, and a second portion 104 that is an advance number. The advance number allows a user to commit a portion of transaction processing and resume processing, while retaining resources. The step ID structure of FIG. 1 may create problems in certain situations.
For example, the time-based portion 102a of the step ID 100 is a local time of the computer executing the transaction in the database. Changes in the local time, such as daylight saving time, can affect the accuracy of the step IDs. For example, if a database continues to generate step IDs during a time adjustment backwards, duplicate step IDs may be generated. That is, during a “fall back” of daylight saving time, duplicate step IDs may be generated for one hour. One conventional solution to this problem is to shut down the database for one hour while the clock catches up. However, because the databases are often critical to the operation of a company, shutting down the database is undesirable.
Databases may allow steps to commit to the database or roll back work in progress while continuing under the same step ID. Continuing use of the step ID, instead of starting a new step ID, improves performance by retaining resources acquired by the step. The advance portion 104 of the step ID may be used when a step commits the previous work and continues new work. That is, the advance portion 104 allows the database to separate transactions within a single step ID. Additionally, steps that roll back and continue to operate may retain the existing step ID. However, the continued use of an existing step ID can affect the database operations.
For example, if the advance portion 104 is incremented beyond a maximum value, such as 4095, the advance portion 104 will wrap-around and restart at zero. When the advance portion 104 wraps around, the database will have different transactions with duplicate identifiers. Duplicate identifiers cause data corruption during recovery.
In another example, if the database fails and a step ID is continued after a roll back in a commit-in-progress state, recovery could pick up and apply updates from the rolled back portion of that step ID, which causes data corruption. FIG. 2 is a flow chart illustrating a conventional method of recovering a database after a crash. At block 202, a last periodic save record (PSR) for the database is identified when recovery starts. At block 204, the database steps backwards through periodic save records to reach a periodic save record where none of the step IDs associated with the periodic save record is on a commit-in-progress list. At block 206, the database applies the updates in the audit trail on the commit-in-progress list to recover the database.
However, the conventional method of recovery shown in FIG. 2 may not always function correctly. When the advance number of the step ID is not incremented during a roll back operation, data may become corrupted during the recovery of FIG. 2. The data corruption occurs because if any of the steps on the commit-in-progress list have done a previous roll back, block 206 will apply updates done both before and after the rollback.