In general, computer systems are subject to hardware and software malfunctions, power outages, and other unforeseen failures that may lead to data loss or corruption. As a result, numerous recovery techniques have been devised to reduce the impact of such failures. Often, these techniques allow a computer system to reconstruct or otherwise recover much of the data after a failure. These techniques may be employed in a variety of applications and environments, such as financial, inventory management, point-of-sale, and travel reservation systems.
One common data recovery and restoration technique is an audit trail. In general, an audit trail is a mechanism that records sequential, time-related system events in an associated audit trail file. For example, a database may utilize an associated audit trail to record changes to the database. As another example, an audit trail may be used to record sequential, time-dependent performance or event records within a transaction-processing environment. As yet another example, a common sequential audit trail is a system log audit trail that captures all computer access events in time order.
The audit trail files may be accessed for a variety of reasons. For example, in the event of database or system failure, a database can be reconstructed from the audit trail. As another example, the audit trail file may be accessed to generate a report detailing user access or program error information.
Often, the audit trail is stored within a respective audit file that resides on a magnetic tape or disc drive storage medium. The audit file typically takes the form of sequential audit blocks. More specifically, for a given audit trail, audit trail data may be accumulated in a temporary storage location to form a single audit block. Upon buffering enough audit trail data, the audit block is written to the audit file in one transfer, rather than the many transfers that would otherwise have been required to write each individual record. The audit blocks are sequentially written to the audit file, and include timestamps and trail block sequence numbers (TBSNs) that indicate the order in which the events occurred.
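The buffering scheme described above can be illustrated with a minimal Python sketch. The class names, the `block_capacity` parameter, and the in-memory list standing in for the tape or disk file are all assumptions made for illustration; they are not drawn from any particular system.

```python
import time
from dataclasses import dataclass, field
from typing import List


@dataclass
class AuditBlock:
    tbsn: int              # trail block sequence number
    timestamp: float       # time at which the block was written
    records: List[str]     # audit records carried by this block


@dataclass
class AuditTrail:
    block_capacity: int = 3  # records per block (illustrative value)
    # In-memory stand-in for the audit file on tape or disk.
    audit_file: List[AuditBlock] = field(default_factory=list)
    transfers: int = 0       # number of writes to the audit file
    _pending: List[str] = field(default_factory=list)
    _next_tbsn: int = 1

    def record(self, event: str) -> None:
        """Buffer one record; write a block once enough have accumulated."""
        self._pending.append(event)
        if len(self._pending) >= self.block_capacity:
            self.flush()

    def flush(self) -> None:
        """Write all buffered records to the audit file in one transfer."""
        if not self._pending:
            return
        block = AuditBlock(self._next_tbsn, time.time(), self._pending)
        self.audit_file.append(block)
        self.transfers += 1
        self._next_tbsn += 1
        self._pending = []


trail = AuditTrail(block_capacity=3)
for i in range(7):
    trail.record(f"event-{i}")
trail.flush()  # write the final, partially filled block
```

In this sketch, seven records reach the audit file in three transfers rather than seven, and the TBSN of each block preserves the order in which the blocks were written.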
If the computer system fails, a recovery utility may retrieve the audit blocks from the audit file and invoke one of a variety of recovery mechanisms. For example, the recovery utility may “replay” one or more of the recorded changes to restore the system to the state prior to the failure. Alternatively, the recovery utility may utilize the audit blocks to roll back the data to a previous state.
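The two recovery mechanisms can be sketched as follows. This sketch assumes that each audit record stores both a before-image and an after-image of the changed value; that record layout, and the names `Change`, `Block`, `replay`, and `rollback`, are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class Change(NamedTuple):
    key: str
    before: Optional[Any]  # value before the change (None: key was absent)
    after: Optional[Any]   # value after the change (None: key was deleted)


class Block(NamedTuple):
    tbsn: int              # trail block sequence number
    changes: List[Change]


def _apply(db: Dict[str, Any], key: str, value: Optional[Any]) -> None:
    """Set key to value, or remove the key when value is None."""
    if value is None:
        db.pop(key, None)
    else:
        db[key] = value


def replay(db: Dict[str, Any], blocks: List[Block]) -> Dict[str, Any]:
    """Reapply recorded changes in TBSN order (forward recovery)."""
    for block in sorted(blocks, key=lambda b: b.tbsn):
        for change in block.changes:
            _apply(db, change.key, change.after)
    return db


def rollback(db: Dict[str, Any], blocks: List[Block], to_tbsn: int) -> Dict[str, Any]:
    """Undo every block with a TBSN greater than to_tbsn, newest first."""
    for block in sorted(blocks, key=lambda b: b.tbsn, reverse=True):
        if block.tbsn <= to_tbsn:
            break
        for change in reversed(block.changes):
            _apply(db, change.key, change.before)
    return db


blocks = [
    Block(1, [Change("acct", None, 100)]),
    Block(2, [Change("acct", 100, 70), Change("seat", None, "12A")]),
]
restored = replay({}, blocks)
undone = rollback(dict(restored), blocks, to_tbsn=1)
```

Replay walks the blocks in ascending TBSN order and applies after-images, while rollback walks them in descending order and applies before-images, which is why the ordering information carried by each block matters.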
One challenge that arises when using an audit trail is the potential inefficiency of writing audit trail data for each transaction. In particular, inefficiencies may arise due to the time required to write each block. Each transfer of an audit block to the audit trail requires time for accessing the storage medium containing the respective audit file, as well as time for actually writing the audit trail data. In a high-volume transaction processing environment, this cumulative time may significantly degrade the system's response to end users.
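The cost described above can be approximated with a simple model in which each transfer pays a fixed access time plus a write time proportional to the data written. The function name and every number below are illustrative assumptions, not measurements of any real storage device.

```python
import math


def audit_write_cost_us(num_records: int, records_per_block: int,
                        access_us: int, write_us_per_record: int) -> int:
    """Total time, in microseconds, to write num_records to the audit file.

    Each transfer pays access_us once; each record pays write_us_per_record.
    """
    transfers = math.ceil(num_records / records_per_block)
    return transfers * access_us + num_records * write_us_per_record


# Hypothetical figures: 1000 records, 5 ms access time, 20 us per record.
one_record_per_transfer = audit_write_cost_us(1000, 1, 5000, 20)
fifty_records_per_block = audit_write_cost_us(1000, 50, 5000, 20)
```

Under these assumed figures the access time dominates the total, so the number of transfers, rather than the volume of data, largely determines the delay seen by end users.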