1. Technical Field
The present invention relates to the field of database management systems generally and, more particularly, to a method and apparatus for detecting and recovering from data corruption of a database by codewording regions of the database and by logging information about reads of the database.
2. Description of the Related Arts
A database is a usefully organized collection of data and is fundamental to many software applications. The database is associated with a database manager and, together with its software application, comprises a database management system (DBMS). In recent years, extensible database systems such as Illustra (now part of the Informix Universal Server) have been developed which allow the integration of application code with database system code. In these systems, the application code has direct access to the buffer cache and other internal structures of the DBMS. Similarly, application programs in many object oriented database (OODB) systems have direct access to an object cache in their address space. This OODB architecture was developed to minimize the cost of accesses to data, for example, to support the needs of Computer Aided Design (CAD) systems. Also, several recently developed storage management systems provide memory resident or memory mapped architectures. For example, the Dali main-memory storage manager described in Bohannon et al., "The Architecture of the Dali Main-Memory Storage Manager," Journal of Multimedia Tools and Applications, 4(2), pp. 115-151 (1997) is designed to provide applications with fast, direct access to data by keeping the entire database in volatile main memory. In all these systems, direct access to data (either in the database buffer cache or in a memory-mapped portion of the database) by application programs is critical to providing fast response times. The alternative to memory mapping is to access data via a server process, but this is an unacceptable solution due to the high cost of inter-process communication. Application code is typically less trustworthy than database system code, and there is therefore a significant risk that "wild writes" and other programming errors can affect persistent data in systems that allow applications to access such data directly.
Since the systems described above are increasingly popular, the risk of wild writes and associated physical corruption is growing. Additionally, there is a risk of damage due to software faults in the DBMS itself. It is therefore important to develop techniques that can mitigate the risk of corruption.
In our parent U.S. patent application Ser. No. 08/66,096, filed Dec. 16, 1996 and entitled "System and Method for Restoring a Distributed Checkpointed Database," we describe the application of multiple checkpoints and the maintenance of a stable log record stored on a server for tracking operations to be made to the multiple checkpoints in a distributed environment. A companion parent application, U.S. patent application Ser. No. 08/67,048, entitled "System and Method for Restoring a Multiple Checkpointed Database in View of Loss of Volatile Memory," filed the same day, describes recovery processes at multiple levels of a DBMS in the event of loss of volatile memory. The '048 and the '096 applications should be deemed to be incorporated by reference herein as to their entire contents. Both of these applications relate to the preservation and restoration of a database (or distributed database), for example, stored in main volatile memory of a data processor.
The problem of detecting and recovering from corruption of data in a database system still remains to be solved in a pragmatic manner without adding considerable overhead to the DBMS. Data corruption may be physical or logical, and it may be direct or indirect. Data is "directly" corrupted by "unintended" updates: in the physical case, by wild writes due to programming errors, as explained above; in the logical case, by incorrectly coded updates or input errors (human errors). Once data is directly corrupted, it may be read by a process, which then issues writes based on the value read. Data written in this manner is indirectly corrupted, and the process involved is said to have carried the corruption. While this process may be a database maintenance process, we focus on transaction-carried corruption, a problem in which the carrying process is executing transactions.
Direct physical corruption can be mostly prevented with hardware memory protection, using the virtual memory support provided by most operating systems. One approach involves mapping the entire database in a protected mode, and selectively un-protecting and re-protecting pages as they are updated. However, this can be very expensive, for example, on standard UNIX systems. An alternative to the hardware approach would be programming language techniques such as type-safe languages or sandboxing. (Sandboxing is a technique whereby an assembly language programmer adds code immediately before a write to ensure that the instruction is not affecting protected space.) However, type-safe languages have yet to be proven in high-performance situations, and sandboxing may perform poorly on certain architectures. Finally, communication across process domain boundaries to a database server process provides protection, but such communication is orders of magnitude slower than access in the same process space, even with highly tuned implementations. The concern over physical corruption is further motivated by the increasing number of systems in which application code has direct access to system buffers, including extensible systems, object databases, and memory-mapped or in-memory architectures. Finally, some work has raised concern over damage to data due to faults in the DBMS itself.
Integrity constraints are widely studied and prevent certain cases of logical corruption in which rules about the data would be violated. However, it is an object of the present invention to deal with those cases in which integrity constraints and other input validation techniques fail, and whether due to programming error or invalid input, unintended updates are made to the database. We consider such cases inherently impossible to prevent, and instead assume that the problem is detected later, usually when a database user notices incorrect output (on a bank statement, for example).
In the field of accounting systems and audits, it is known from L. A. Bjork, Jr., "Generalized Audit Trail Requirements and Concepts for Data Base Applications", IBM Systems Journal, No. 3, 1975, pp. 229-245, how to create and maintain an audit trail: a history of activities by transaction, posted because of operations on specific data. Bjork describes that a time dimension can be added to a stored record such that supplemental information in the form of a descriptor and time frame are maintained. For example, the time frame when the information was created and stored, and each version of a data field, is maintained along with the action. He refers to "create" (creation of data), "reference" (when reference is made to x at time t), and "update" (when created data is updated) as descriptors all having a time dimension. A further descriptor is "refer to prior generation" (when data now updated is referred to by a prior generation). Also, C. T. Davies, Jr., in his article "Data Processing Spheres of Control" appearing in the IBM Systems Journal, Vol. 17, No. 2, 1978, pp. 179-198, describes "in-process recovery" as the control of recording and subsequent use of data required to return to a previous point in a process, and describes that the process that created or last modified data elements may be determined from a journal. System recovery is obtained by establishing checkpoints that represent an earlier state of a data base. Once a search backward is conducted to find an error, a checkpoint behind the error is obtained from which to rebuild. These accounting processes, typified for example by the recovery from a payroll error, are not examined by Bjork or Davies for the generic case of database recovery, nor are they automated.
Bjork and Davies provide no suggestions for real-time implementation, for example, of a read-logging recovery system such as would be required in a communications system and associated record-keeping environment.
Thus, there appears a genuine need in the art of database management systems to provide an improved method and apparatus for detecting and recovering from corruption of a database via read logging.
According to the principles of the present invention, several new techniques are applied for the prevention or detection of corruption. The new techniques may be suitable for application in a real-time environment such as a telecommunications system.
For detecting indirect logical or physical corruption, it is a feature of the present invention to log information about reads (Read Logging). Interestingly, any negative impact of Read Logging is limited, as the actual values read are not logged according to one embodiment of the present invention, just the identity of the item read and optionally a checksum of the value. Moreover, it is an extension of the present invention to store codewords for each read of the read log records.
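By way of illustration only, Read Logging as summarized above might be sketched as follows. The sketch is not the implementation of the invention; the names ReadLogRecord and ReadLoggingStore, the in-memory dictionary standing in for the database, and the use of CRC-32 as the checksum are all assumptions made for the example. Each read appends a record holding the identity of the item read and, optionally, a checksum of the value, while the value itself is never logged.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ReadLogRecord:
    """One read log entry: who read which item, plus an optional checksum."""
    txn_id: int
    item_id: str
    checksum: Optional[int]  # checksum of the value read; the value is NOT stored


class ReadLoggingStore:
    """Hypothetical key/value store that logs every read (illustrative only)."""

    def __init__(self, log_checksums: bool = True) -> None:
        self.data: Dict[str, bytes] = {}
        self.log: List[ReadLogRecord] = []
        self.log_checksums = log_checksums

    def write(self, item_id: str, value: bytes) -> None:
        self.data[item_id] = value

    def read(self, txn_id: int, item_id: str) -> bytes:
        value = self.data[item_id]
        # CRC-32 is an assumed checksum choice, not one named in the text.
        checksum = zlib.crc32(value) if self.log_checksums else None
        self.log.append(ReadLogRecord(txn_id, item_id, checksum))
        return value
```

Because only an identifier and one checksum word are recorded per read, the size of each log record is independent of the size of the item read.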
When corruption is detected rather than prevented, techniques for corruption recovery are employed to restore the database to an uncorrupted state. As will be further described herein, once codewording of data and read logging are performed, models and algorithms are presented for recovery from transaction-carried indirect corruption. In these models, the read log records are preferably combined with known write log records and operated on more efficiently as a combined log to detect and recover from transaction-based corruption (although in other embodiments of the present invention, the read log and write log records may be separately maintained). One model, the redo-transaction model, uses logical descriptions of transactions to repair the database state. A second model, the delete-transaction model, focuses on removing the effects of corruption from the database image. The algorithms presented herein can be applied to recovery from logical or physical corruption. In addition, tracing techniques are presented which aid in determining the scope of logical corruption.
To ascertain the performance of our algorithms for detecting and recovering from physical corruption, we have studied the impact of these schemes on a TPC-B style workload implemented in the Dali main-memory storage manager. Our goal was to evaluate the relative impact on normal processing of schemes that can be easily ported across a variety of architectures and operating systems. In addition to our schemes, we study a hardware-based protection technique. For detection of direct corruption, the overheads imposed cause throughput of update transactions to be decreased by 8%. Prevention of transaction-carried corruption with Read Prechecking costs between 12% and 72%, but requires a significant space overhead to achieve the better performance numbers. Detection of transaction-carried corruption with Read Logging costs between 17% and 22%. Our study indicates that the corruption prevention algorithms of Sullivan et al., "Using Write Protected Data Structures to Improve Fault Tolerance in Highly Available DBMS," in Proceedings of the International Conference on Very Large Databases, pp. 171-179, 1991, when using standard OS support for memory protection, decrease throughput by about 38%. Thus, the codeword and read logging based detection and prevention schemes of the present invention perform significantly better than the hardware-based protection.
With the present invention, it is possible to identify a subset of the later transactions that were (directly or indirectly) affected by the error, and to selectively roll them back and redo them manually (or even automatically in some cases). Also, the techniques of the present invention are language and instruction-set independent.
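As a hedged illustration of how a combined read and write log could be used to identify the affected subset of transactions, consider the following simplified sketch. It is not the recovery algorithm of the invention; the function name affected_transactions, the (operation, transaction, item) tuple layout, and the conservative rule that every write issued by a transaction after it has read corrupted data is itself treated as corrupted are assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Assumed log layout: (operation, transaction id, item id), in log order.
LogEntry = Tuple[str, int, str]


def affected_transactions(
    log: List[LogEntry], corrupted_items: Iterable[str]
) -> Tuple[Set[int], Set[str]]:
    """Scan the log forward and propagate corruption from reads to writes.

    A transaction becomes tainted when it reads a tainted item; every item it
    writes afterwards becomes tainted too (a conservative simplification).
    """
    tainted_items: Set[str] = set(corrupted_items)
    tainted_txns: Set[int] = set()
    for op, txn, item in log:
        if op == "read" and item in tainted_items:
            tainted_txns.add(txn)
        elif op == "write" and txn in tainted_txns:
            tainted_items.add(item)
    return tainted_txns, tainted_items


# Usage: item "a" was directly corrupted before the logged operations below.
example_log = [
    ("read", 2, "a"), ("write", 2, "b"),   # transaction 2 carries the corruption
    ("read", 3, "c"), ("write", 3, "d"),   # transaction 3 never sees corrupt data
    ("read", 4, "b"), ("write", 4, "e"),   # transaction 4 is affected indirectly
]
txns, items = affected_transactions(example_log, {"a"})
```

In this example only transactions 2 and 4 would be candidates for rollback and redo; transaction 3 is left untouched because the log shows it read no corrupted item.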
Thus, a method of detecting and recovering from data corruption of a database according to the present invention is characterized by the step of logging information about reads of a database to detect and recover from physical corruption of the data in the database, wherein the physical corruption arises from bad writes of data to the database or arises indirectly from the bad writes. In a delete-transaction model, the corruption recovery comprises first and second phases, a first redo phase followed by an undo phase. In another embodiment, a method of detecting and recovering from database corruption comprises the steps of logging information about reads of a database to detect and recover from logical corruption, maintaining a logical redo log and storing user inputs for a transaction in the logical redo log. Alternatively, in the logical corruption recovery method, the method comprises the step of logging a checksum of a logical state found.
In our co-pending, concurrently filed patent application, we describe a Read Prechecking scheme that associates a one-word codeword with each region of data, and prevents transaction-carried corruption by verifying that the codeword matches the data each time it is read. A Data Codeword scheme, a less expensive variant of Read Prechecking, allows detection of direct physical corruption by asynchronously auditing the codewords. This scheme is also referred to herein and in our co-pending application as deferred codeword maintenance and involves performing codeword updates during a process called "log flushing," at the same time as data is flushed to disk from main memory. These schemes are disclosed and claimed in concurrently filed, co-pending U.S. patent application Ser. No. 60/099,271 entitled "Method and Apparatus for Detecting and Recovering from Data Corruption via Read Prechecking and Deferred Maintenance of Codewords," of the same inventors.
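The codeword schemes summarized above may be illustrated, under stated assumptions, by the following minimal sketch. It is not the implementation of the co-pending application: the class CodewordedDatabase, the fixed region size, the wild_write helper that simulates a stray update, and the choice of the XOR of a region's 32-bit words as the one-word codeword are all hypothetical. The read_prechecked method corresponds to verifying the codeword on each read, and audit corresponds to checking all codewords in a separate pass; deferred maintenance at log flush time is not modeled.

```python
import struct
from typing import List


def codeword(region: bytes) -> int:
    """One-word codeword: XOR of the region's 32-bit words (assumed function)."""
    padded = bytes(region) + b"\x00" * (-len(region) % 4)
    result = 0
    for (word,) in struct.iter_unpack("<I", padded):
        result ^= word
    return result


class CorruptionDetected(Exception):
    """Raised when a region no longer matches its stored codeword."""


class CodewordedDatabase:
    def __init__(self, num_regions: int, region_size: int = 64) -> None:
        self.regions = [bytearray(region_size) for _ in range(num_regions)]
        self.codewords = [codeword(r) for r in self.regions]

    def _store(self, region: int, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > len(self.regions[region]):
            raise ValueError("write outside region")
        self.regions[region][offset:offset + len(data)] = data

    def update(self, region: int, offset: int, data: bytes) -> None:
        """Legitimate update: changes the data and maintains the codeword."""
        self._store(region, offset, data)
        self.codewords[region] = codeword(self.regions[region])

    def wild_write(self, region: int, offset: int, data: bytes) -> None:
        """Simulated stray write: changes data but bypasses the codeword."""
        self._store(region, offset, data)

    def read_prechecked(self, region: int, offset: int, length: int) -> bytes:
        """Verify the region's codeword before returning any of its data."""
        if codeword(self.regions[region]) != self.codewords[region]:
            raise CorruptionDetected(f"region {region} fails its codeword check")
        return bytes(self.regions[region][offset:offset + length])

    def audit(self) -> List[int]:
        """Return the indexes of all regions whose codeword does not match."""
        return [i for i, r in enumerate(self.regions)
                if codeword(r) != self.codewords[i]]
```

A single XOR word cannot catch every possible corruption (two stray writes may cancel each other), so the sketch shows only the general shape of prechecking and auditing.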