Many contemporary application tasks use database systems to store, retrieve, and even process data. Database systems typically include a database manager ("DBM") and a database (i.e., a data repository). A DBM is a control application that supervises or manages interactions between application tasks and the database. Such supervision and management vary among DBMs based on a given DBM's sophistication and the database's intended use. Arguably, two of the most important DBM functions are those that ensure data recovery (in response to a database crash) and data integrity.
Data recovery involves rebuilding at least part of a database after all or part of its data is corrupted or lost, for example because of a power outage, a program crash, or the like. If certain data is corrupted or lost, the DBM will "recover" at least the affected portion, usually to a last known valid or "uncorrupt" state. When database recovery efforts are undertaken, extended time delays are expected.
With respect to data integrity, however, time delays or latencies (the time differential between a request for data and actual receipt of the same) are largely intolerable. Early database systems were divided between main (volatile) and disk (non-volatile) memory; DBMs and application tasks resided, at least in part, in volatile memory, while the database was stored in non-volatile memory. Such systems, and their "disk"-based successors, have failed to meet the performance requirements of contemporary high-speed information management systems ("IMSs," such as communications switching systems). This has frequently been due to latencies inherent to non-volatile memory transactions (e.g., accesses, retrievals, modifications, indices, copies, etc.), exacerbated by data integrity techniques.
Contemporary IMSs demand fast and predictable transaction response times, particularly for transactions that do not modify or otherwise change a given database ("read-only transactions"). One popular methodology maps the entire database into volatile memory (a "main memory database") to improve IMS performance, particularly transaction response times. Unfortunately, to ensure data integrity, conventional main memory DBMs delay the processing of transactions that modify portions of the database (termed "update transactions") until other transactions with respect to such portions are processed. For instance, if two transactions attempt to access the same file, entry, field, or the like (collectively, a "data record") simultaneously, contemporary DBMs ensure data integrity by preventing an update transaction from modifying the data record while the other relies on the contents of the same.
Database modifications, however, generally affect only a small number of data records. Typically, a DBM monitors the status of a data record that is the subject of an update transaction and grants the update transaction a right to modify that data record only when the data record is free (not otherwise being used or relied upon by another transaction). This right is commonly either a lock (i.e., a control mechanism that prevents other transactions from getting to the same data records) or a latch (i.e., a semaphore, a control mechanism that signals to other transactions that another transaction is modifying or changing these data records). Either one causes other transactions to "wait" to see the affected data record while the update transaction modifies the same.
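By way of illustration only, the grant-when-free behavior described above can be sketched as a toy lock table. The names `LockTable`, `acquire`, and `release` are hypothetical and are not drawn from any particular DBM; the sketch assumes a single exclusive holder per data record and a first-come, first-served wait queue.

```python
class LockTable:
    """Toy lock table (hypothetical): one exclusive holder per data record."""

    def __init__(self):
        self._holder = {}   # data record id -> transaction id holding the lock
        self._waiters = {}  # data record id -> transaction ids waiting, in order

    def acquire(self, txn, record):
        """Grant the lock if the record is free; otherwise queue the transaction."""
        holder = self._holder.get(record)
        if holder is None or holder == txn:
            self._holder[record] = txn
            return True
        queue = self._waiters.setdefault(record, [])
        if txn not in queue:
            queue.append(txn)  # the transaction must "wait"
        return False

    def release(self, txn, record):
        """Release the lock; pass it to the longest-waiting transaction, if any.

        Returns the transaction that now holds the lock, or None if the
        record became free.
        """
        if self._holder.get(record) != txn:
            raise ValueError("transaction does not hold this lock")
        queue = self._waiters.get(record, [])
        if queue:
            next_txn = queue.pop(0)
            self._holder[record] = next_txn
            return next_txn
        del self._holder[record]
        return None
```

In this sketch, a second transaction that calls `acquire` on a held data record receives `False` and is queued, which corresponds to the "wait" described above.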
Update transactions tend to be multi-step processes. As such, it is quite common for a DBM to require a given update transaction to wait between process steps while other update transactions complete. While waiting, the update transaction retains its data record locks or latches; these other update transactions also maintain their data record locks and latches. This can lead to interdependency conflicts that require DBM intervention to resolve.
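One common way to detect such interdependency conflicts, offered here as an assumption rather than as the approach of any particular DBM, is to search a "waits-for" graph for a cycle. The function name `find_cycle` and the dictionary layout are hypothetical.

```python
def find_cycle(waits_for):
    """Return a list of transactions that wait on one another in a cycle, or None.

    waits_for maps each transaction id to the set of transaction ids whose
    locks or latches it is waiting for (a hypothetical representation).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    colour = {}
    path = []

    def visit(txn):
        colour[txn] = GREY
        path.append(txn)
        for other in sorted(waits_for.get(txn, ())):
            state = colour.get(other, WHITE)
            if state == GREY:
                # 'other' is already on the current path: a cycle exists.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        colour[txn] = BLACK
        path.pop()
        return None

    for txn in sorted(waits_for):
        if colour.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None
```

If a cycle is reported, a DBM would typically intervene, for example by aborting one of the listed transactions so that the others can proceed; which transaction to abort is a policy choice outside this sketch.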
Therefore, while main memory databases have increased speed, the above-described "waits" and conflicts introduce unpredictability into transaction throughput and database response time. This is particularly true for read-only transactions, which require only a simple "look and see" database access yet may be severely delayed by such waits and conflicts.
Contemporary control methodologies reduce conflicts between update and read-only transactions by giving the latter consistent, but "old" or out-of-date, views of certain data records or data record types. This is commonly referred to as multi-versioning, in which DBMs retain or archive multiple versions of recently updated data records for use by read-only transactions. Multi-version DBMs use time stamps to serialize read-only and update transactions, and, more recently, to serialize read-only transactions with respect to update transactions. These DBMs require update transactions to perform locking to serialize themselves with respect to other update transactions, but not with respect to read-only transactions.
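A minimal sketch of time-stamp-based multi-versioning follows, assuming that each committed update appends a new version with a strictly increasing time stamp and that a read-only transaction reads as of the time stamp at which it began. The class `VersionedRecord` and its methods are hypothetical names used only for illustration.

```python
import bisect


class VersionedRecord:
    """Toy multi-version data record (hypothetical), versions ordered by time stamp."""

    def __init__(self):
        self._stamps = []  # commit time stamps, strictly ascending
        self._values = []  # value written at the matching time stamp

    def write(self, stamp, value):
        """Append a new version; the update transaction is assumed to hold a lock."""
        if self._stamps and stamp <= self._stamps[-1]:
            raise ValueError("time stamps must strictly increase")
        self._stamps.append(stamp)
        self._values.append(value)

    def read_as_of(self, stamp):
        """Return the newest version committed at or before the given time stamp."""
        i = bisect.bisect_right(self._stamps, stamp)
        if i == 0:
            raise KeyError("no version visible at this time stamp")
        return self._values[i - 1]
```

Because `read_as_of` never blocks on a writer, a read-only transaction in this sketch sees a consistent but possibly out-of-date value, at the cost of retaining the older versions in memory.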
Multi-versioning techniques, while reducing "waits" and conflicts among transactions, often conflict with DBM efforts to utilize main memory capacity efficiently. As main memory remains significantly more expensive than disk memory, main memory DBMs continuously expend processing resources collecting or "ageing" old and no longer needed data record versions, regardless of main memory utilization. Contemporary versioning schemes fail to appreciate the various costs associated with collecting such data record versions, particularly failing to understand a tradeoff between ensuring continual and optimal main memory capacity and an efficient use of processing resources. Therefore, a need exists in the art for an efficient means of reclaiming main memory space no longer used by such multi-version techniques--to age, logically and economically, data record versions in a main memory database.
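The tradeoff described above can be illustrated with a hypothetical sketch in which old versions are collected only when main memory utilization crosses a threshold. The function names, the `threshold` default of 0.8, and the rule that a version is needed only if some active read-only transaction can still see it are all assumptions made for illustration; they are not asserted to be the method of any existing DBM.

```python
import bisect


def reclaimable(stamps, oldest_reader_stamp):
    """Count leading versions that no active read-only transaction can still see.

    stamps is an ascending list of version time stamps for one data record.
    The newest version at or before oldest_reader_stamp must be kept, since
    the oldest active reader still reads it; everything older can go.
    """
    i = bisect.bisect_right(stamps, oldest_reader_stamp)
    return max(i - 1, 0)


def age_versions(stamps, oldest_reader_stamp, used_fraction, threshold=0.8):
    """Return the version time stamps to keep.

    Collection runs only when main memory utilization (used_fraction, a value
    between 0 and 1) reaches the threshold, so that processing resources are
    not spent on ageing while memory is plentiful.
    """
    if used_fraction < threshold:
        return list(stamps)
    return list(stamps[reclaimable(stamps, oldest_reader_stamp):])
```

For example, with versions stamped 10, 20, 30, and 40 and an oldest active reader at time stamp 25, only the version stamped 10 is reclaimable, and it is discarded only if utilization has reached the threshold.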