This invention relates to the field of audit trail storage and recovery and has particular application to systems of auditing databases.
Real-time performance of large or rapidly accessed databases has become a necessary business tool in communications, electronic commerce, and in support of processes in many other forms of commerce. Consequently, the ability to recover quickly from a full or partial system failure has become a weak link in the chain of support for the computing and communications systems which run the databases that support commerce and communications. The importance of quick recovery is underscored by the fact that many systems have been designed to allow operations during recovery by an audit system. Such a system is described in U.S. Pat. No. 5,734,817, issued to Roffe et al., the disclosure of which is incorporated herein in its entirety by this reference.
Currently, many database servers have tape storage audit trails, and the tape systems typically run very quickly, say, filling a tape in 12 minutes or less, to accommodate the large amounts of data needed for recovery. The records are typically stored in "audit blocks" of fixed or variable length, depending on the system, and several thousand of them can be stored per second. The tape and other audit systems will also typically have system status saves stored at regular intervals selected by the audit program or audit program manager. These records may be stored in the form of audit blocks (and we call such blocks "P-Saves", for "periodic saves" of system data, for the remainder of this document).
Thus, it is necessary to search through the audit blocks to find where to begin the reconstruction process so that a database can be restored to its state prior to the crash. This restoration will cause any records which may have been opened, or opened and partially operated upon, or opened, partially operated upon, and left not closed, to be set to an appropriate state, or exceptions to be issued where necessary. (A transaction process that is completed is sometimes called a completed "step". "Step" is a term that will be used frequently herein, and each of these operations would be viewed as such a "step". The importance of this term will become apparent within.) It is easy to understand that such reconstruction and restoration are critical functions in financial transactional databases. Thus, for a bank or other commercial operation to be unable to accomplish such restoration work very quickly is anathema to its business success, since all operations of such a compromised database should be put on hold until the recovery is complete. To do otherwise would be to risk the credibility of the data integrity in the whole system, and the business (or other operations) which depends on such records being accurate.
The tape storage systems which contain most modern audit data are typically optimized to run forwards while making recordings at a rapid continuous rate. Consequently, operating them to recover from a failure requires them to run in non-optimized modes, introducing delays in recovery time which can be hours long. Part of the delay is introduced in reading each audit data block, determining what is in it, then backing up (reversing) the tape to the next previous block, reading it, determining what is in it, and so on, until all open items or incomplete transactions are discovered. Only then can a reasonable system proceed to read the entire tape forward to once again read the audit blocks and reconstruct the activity occurring at the time of the failure so that the failure can be corrected or appropriately accommodated. Such wait times before beginning recovery can be extremely significant, risking the real-time commercial or communications activities for which the databases are used. Also, positioning near the end of the tape may be time consuming due to the size of the latest tape storage systems. However, positioning near the end of the tape is required for many of these latest tape storage systems in order to start a search for all activity in progress at the time of the system failure.
To illustrate by way of a few examples, consider the functioning of a large widely distributed Automatic Teller Machine network, or a major airline reservation system, or a check clearing operation. If the system has a failure which requires a shutdown for 6 hours to recover, that would be catastrophic to the business operations of the teller network, the airline or the check clearing system. Also, in systems that may require relatively frequent transfers of large pieces of data, such as video records, the time frame in which a particular record is open or being transferred is potentially much larger. Thus, in such systems also, recovery would require resort to many blocks of storage in an audit trail to discover the point in the audit record at which the recovery process should begin. (The restoration and recovery processes are commonly performed by another automatic system commonly called an executive or recovery executive program, the details of which are not required for an understanding of the instant invention. Unisys sells such programs under the name IRU or Integrated Recovery Utility, currently.)
Mass storage, or disk storage, may offer significantly faster recovery, because paging through the audit trail to find records of incomplete or corrupted transactions will be quicker than with tape. Nevertheless, mass storage systems still introduce significant lag time in the piecing together of the audit record blocks which are relevant to the particular problems outstanding in a failure, because they require many individual seeks and reads to find the relevant starting point for recovery. In a large database which turns over data many times in a short period, or which has extended periods during which a particular record may be operated on, reviewing the thousands of audit records needed to find the initial activation/opening/call to/writing of a particular record can thus still take an unacceptably long time. Further, the current cost of disk storage is far higher than that of tape storage. The time and effort required for transferring the data to a longer term storage type, for example tape media, should also be factored in.
In the context of using mass storage for audit trail information, U.S. Pat. No. 5,561,795 (Sarkar), incorporated herein by this reference, describes a system of keeping a time for the beginning of a transaction (that is, one affecting each of several cached pages of a database) and storing it, updated, every time the cache audit trail is written to the non-volatile (mass) storage. Sarkar requires a transaction control table from which the oldest uncommitted transaction can be found. In a system with thousands of ongoing transactions, each one would therefore require an entry in this table in order for the table to be useful in establishing an audit trail in accord with Sarkar's invention.
Myre, Jr., et al., in U.S. Pat. No. 5,043,806, also incorporated herein by this reference, is cited by Sarkar as prior art. Myre et al. stored a periodically determined earliest uncompleted transaction Log Sequence Number (LSN) and an earliest uncommitted transaction LSN. This, in turn, was an improvement on the art before Myre, which merely stored all uncompleted transaction and uncommitted transaction LSNs. Myre, like Sarkar, required reference to a transaction table to determine the earliest LSNs of relevance. Both would store the earliest open transaction during the equivalent P-SAVE operations in creating the audit trail. What is problematic is that the time and the tape or mass storage locations of the earliest open transaction (or step) have to be saved, and therefore kept in the transaction table, wasting a great deal of main memory real estate. This in turn lowers the ability of the whole computer system to function, relative to the required storage size, and makes audit trails expensive.
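The memory cost of the transaction-table approach described above can be illustrated with a minimal Python sketch. It is not taken from the Sarkar or Myre patents; the class name `TransactionTable` and its methods are hypothetical and are offered only to show why one in-memory entry per open transaction (or step) is needed before the earliest open LSN can be written at each periodic save.

```python
class TransactionTable:
    """Hypothetical in-memory table: one entry per open transaction (step)."""

    def __init__(self):
        # step id -> LSN of the first audit record written for that step
        self._begin_lsn = {}

    def begin(self, step_id, lsn):
        # Record where the step started; keep the first LSN if called twice.
        self._begin_lsn.setdefault(step_id, lsn)

    def end(self, step_id):
        # A completed step no longer constrains where recovery must start.
        self._begin_lsn.pop(step_id, None)

    def earliest_open_lsn(self):
        # Value that would be stored at each periodic save; None if nothing is open.
        return min(self._begin_lsn.values(), default=None)

    def __len__(self):
        # Entries held in main memory: grows with the number of open steps.
        return len(self._begin_lsn)
```

Under these assumptions the table size tracks the number of concurrently open steps, which is the main-memory overhead the passage above objects to.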
Therefore, in the absence of such constant record keeping, which requires significant main memory resources, a Myre-like system would require finding every periodic save (P-SAVE) of the records of open transactions (or steps) until all the steps' initiation points can be accounted for on the audit trail prior to doing a recovery. Especially in tape-based audit systems, reading backwards during recovery is extremely time intensive without a Sarkar-like system and can become commercially unviable.
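The backward search described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the audit trail is modeled as an in-memory list, each block is a "begin", "end", or "psave" record, and a P-Save is assumed to list the steps open at the time it was written. The names `AuditBlock` and `find_recovery_start` are hypothetical and do not come from any of the cited systems.

```python
from dataclasses import dataclass


@dataclass
class AuditBlock:
    kind: str                          # "begin", "end", or "psave" (assumed record kinds)
    step_id: int = -1                  # used by "begin" and "end" blocks
    open_steps: frozenset = frozenset()  # used by "psave": steps open at that save


def find_recovery_start(trail):
    """Scan backward from the failure point for the earliest open step's start.

    Returns (start_index, blocks_read). start_index == len(trail) means
    no step was open at the time of failure.
    """
    ended = set()        # steps whose "end" block was seen during the backward scan
    pending = None       # open steps still lacking a "begin"; None until a P-Save is read
    start = len(trail)
    blocks_read = 0
    for i in range(len(trail) - 1, -1, -1):
        block = trail[i]
        blocks_read += 1  # each iteration stands for one slow reverse read
        if block.kind == "end":
            ended.add(block.step_id)
        elif block.kind == "begin":
            if block.step_id not in ended:
                start = i  # step never ended, so it was open at failure
            if pending is not None:
                pending.discard(block.step_id)
        elif block.kind == "psave" and pending is None:
            # Steps open at this save and not ended afterward began earlier on the trail.
            pending = set(block.open_steps) - ended
        if pending is not None and not pending:
            break          # every open step's initiation point is accounted for
    return start, blocks_read
```

The `blocks_read` count stands in for the cost the passage is concerned with: every block visited here corresponds to a reverse read on a medium optimized for forward recording, and a single long-running step can force the scan far back along the trail.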
Accordingly, there is a great commercial need for a method or system that speeds up examination of audit record blocks preparatory to restoration of a database to a fully operative condition and that, at the same time, requires little if any storage area in main memory or processing overhead to implement.
Further, it has remained extremely economical, relative to other forms of mass storage, to employ magnetic tape drives for storing large amounts of data that may only occasionally be used. Current tape drive technology can record very quickly, but readback in the direction opposite from recording is very slow because the drives are optimized for fast recording. Accordingly, an audit trail recording system adapted specifically to, and optimized for, such tape systems is another unmet commercial need.