A database is a collection of data that is organized in a useful manner and is fundamental to some software application (e.g., an information management system). The database is associated with a database manager ("DBM"), which is itself software-based and performs a range of tasks on the database, usually on behalf of the software application. The range of tasks depends largely upon the intended use of the database and the sophistication of the DBM.
Traditionally, databases have been stored in non-volatile (disk) memory, while DBMs and software applications have resided, at least in pertinent part, in volatile (main) memory. DBMs have been distinguished by the manner in which they process and manipulate the data with which they are charged. For example, some DBMs manipulate only one data file at a time (flat-file DBMs), while others process multiple data files at one time, associating data from several different data files (relational DBMs).
Fundamental DBM operations include storing data, creating indexes that allow retrieval of data, and linking data from different files (relational DBMs). Two of the most important, and hence most sophisticated, operations performed by DBMs are maintaining data integrity and performing database recovery.
Data integrity, very simply, ensures that one software application cannot modify a particular data file while another software application is relying upon the contents of the same file. Database recovery, on the other hand, involves rebuilding the database after part or all of its data is corrupted. Data corruption may be caused by a power outage, a program crash or the like that causes the DBM to suspect that at least part of the data stored in the database has been lost or damaged.
Today, many software applications require high performance access to data, with response time requirements on the order of a few to tens of milliseconds. Traditional non-volatile (disk) memory DBMs have been largely incapable of meeting the high performance needs of such applications, often due to the latency of accessing data that is resident in non-volatile memory.
In an effort to improve performance, one approach provides a large buffer, resident in volatile memory, into which a portion of the database is loaded for faster access. A fundamental problem arises, however, when a conventional disk-based architecture is adopted to implement such a system. Most disk-based architectures have a buffer manager, and each page request results in a search of the memory buffer to see whether the requested page is already resident there. Thus, even if a page is cached in the memory buffer, access to data on the page requires locating the page and "pinning" it in the buffer. These operations tend to substantially increase processing overhead.
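The lookup-and-pin overhead described above can be illustrated with a minimal sketch. The class `BufferPool`, its methods, and the dictionary standing in for disk are hypothetical names chosen for illustration only and do not describe any particular DBM; the eviction policy shown (first unpinned page found) is likewise an assumption.

```python
class BufferPool:
    """Hypothetical buffer manager with a fixed number of frames (illustration only)."""

    def __init__(self, capacity, disk):
        self.capacity = capacity   # maximum number of resident pages
        self.disk = disk           # dict page_id -> bytes, stands in for disk storage
        self.frames = {}           # resident pages: page_id -> bytearray
        self.pin_count = {}        # page_id -> number of outstanding pins
        self.lookups = 0           # counts buffer searches, i.e. per-access overhead

    def pin(self, page_id):
        # Every access, even to an already-cached page, pays for this search.
        self.lookups += 1
        if page_id not in self.frames:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_id] = bytearray(self.disk[page_id])
        self.pin_count[page_id] = self.pin_count.get(page_id, 0) + 1
        return self.frames[page_id]

    def unpin(self, page_id):
        self.pin_count[page_id] -= 1

    def _evict(self):
        # Assumed policy: write back and drop the first page that nobody has pinned.
        for pid in list(self.frames):
            if self.pin_count.get(pid, 0) == 0:
                self.disk[pid] = bytes(self.frames.pop(pid))
                self.pin_count.pop(pid, None)
                return
        raise RuntimeError("all resident pages are pinned")
```

In this sketch a caller must bracket every data access with `pin` and `unpin`, which is the per-access cost that the paragraph above attributes to adopting a disk-based architecture for a memory-resident buffer.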
Another approach maps the entire database directly into volatile (main) memory. The data may be accessed either directly by virtual memory pointers, or indirectly via location-independent database offsets that quickly translate into memory addresses. Data requests therefore need not interact with a buffer manager, either for locating data or for fetching and pinning buffer pages. Data access using a main-memory database is much faster than data access using a disk-based storage manager, since pages need not be written to disk during normal processing to make space for other pages.
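The notion of a location-independent database offset can be sketched as follows. The record layout (a 32-bit value followed by a 32-bit offset of the next record, with -1 marking the end of a chain) and the helper names `put_record` and `walk` are assumptions made purely for illustration; a `bytearray` stands in for the memory-mapped database image.

```python
import struct

# Assumed record layout: unsigned 32-bit value, signed 32-bit offset of next record.
RECORD = struct.Struct("<Ii")
END = -1  # assumed sentinel marking the end of a chain

def put_record(image, offset, value, next_offset):
    """Store a record at a database offset inside the mapped image."""
    RECORD.pack_into(image, offset, value, next_offset)

def walk(image, offset):
    """Follow a chain of records by offset; no buffer lookup or pinning occurs."""
    values = []
    while offset != END:
        # Translating an offset is a plain index into the mapped region.
        value, offset = RECORD.unpack_from(image, offset)
        values.append(value)
    return values

image = bytearray(64)            # stands in for the database mapped into main memory
put_record(image, 0, 10, 24)
put_record(image, 24, 20, 8)
put_record(image, 8, 30, END)
print(walk(image, 0))            # [10, 20, 30]
```

Because records refer to one another by offset rather than by absolute address, a copy of the image placed elsewhere in memory can be traversed in the same way without rewriting any stored references.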
A significant danger exists, however: if a portion or all of the main-memory database becomes corrupted then, unlike a non-volatile (disk) memory database, the entire database may need to be recovered. One recovery approach uses undo log records to track the progress of transactions that have modified the database in some manner. Traditional recovery schemes implement write-ahead logging ("WAL"), whereby all undo logs for updates on a page are "flushed" to disk before the page itself is flushed to disk. To guarantee that the WAL property holds and, hence, that the recovery method works, a latch is held on the page (or possibly on some system log) while the page is copied to disk. This reintroduces disk memory processing costs, as such latching tends to significantly increase access costs to non-volatile memory, increase programming complexity, and interfere with normal processing.
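The WAL ordering and the accompanying page latch can be sketched as shown below. This is a simplified illustration under stated assumptions, not a description of any particular recovery scheme: the class `WalStore`, the use of log sequence numbers ("LSNs") to order undo records, and the in-memory lists and dictionaries standing in for disk are all hypothetical.

```python
import threading
from collections import defaultdict

class WalStore:
    """Hypothetical store that enforces the WAL property (illustration only)."""

    def __init__(self):
        self.pages = {}          # in-memory pages: page_id -> value
        self.page_lsn = {}       # page_id -> LSN of the last update to that page
        self.log_tail = []       # undo records not yet flushed: (lsn, page_id, old_value)
        self.disk_log = []       # stands in for the undo log on disk
        self.disk_pages = {}     # stands in for pages on disk
        self.flushed_lsn = 0     # highest LSN known to be in disk_log
        self.next_lsn = 1
        self.latch = defaultdict(threading.Lock)  # one latch per page

    def update(self, page_id, new_value):
        with self.latch[page_id]:
            lsn = self.next_lsn
            self.next_lsn += 1
            # Record the old value so that the update can be undone.
            self.log_tail.append((lsn, page_id, self.pages.get(page_id)))
            self.pages[page_id] = new_value
            self.page_lsn[page_id] = lsn

    def flush_log(self, up_to_lsn):
        while self.log_tail and self.log_tail[0][0] <= up_to_lsn:
            self.disk_log.append(self.log_tail.pop(0))
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)

    def flush_page(self, page_id):
        # The latch is held for the whole copy, blocking updates to this page.
        with self.latch[page_id]:
            lsn = self.page_lsn.get(page_id, 0)
            if lsn > self.flushed_lsn:
                self.flush_log(lsn)   # WAL: undo records reach disk before the page
            self.disk_pages[page_id] = self.pages[page_id]
```

In this sketch, `update` cannot proceed on a page while `flush_page` holds that page's latch, which illustrates how the latching needed to guarantee the WAL property interferes with normal processing.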
What is needed in the art is a way of restoring a corrupted volatile memory database that does not significantly increase processing costs, such as those costs associated with conventional recovery methods. What is further needed is a recovery scheme that substantially reduces the duration of latches on pages during updates.