Transaction processing (OLTP) is undergoing a paradigm shift in both hardware and software. The hardware trends are toward cheaper, larger main memory and a larger number of cores per processor. These trends are paving the way for OLTP databases to become entirely memory-resident (substantially lower latency) and to support a higher degree of concurrency (substantially higher throughput). The software trend is the rise of multi-version databases that avoid in-place updates of records and retain the history of data (both the old and new versions of a modified record).
In state-of-the-art optimistic models, a transaction passes through a strict sequence of phases: (i) reading a set of records (read phase); (ii) performing any arbitrary computation (compute phase); (iii) validating that the read records have not been changed by other transactions (validate phase); (iv) writing a set of records (write phase); and (v) committing the transaction (commit phase).
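The five phases can be illustrated with a minimal sketch. It assumes a toy in-memory store in which every key carries a version counter; the names `VersionedStore` and `run_transaction` are hypothetical and chosen only for illustration, and the single lock that serializes validation and writing is a simplification that a real engine would avoid.

```python
import threading


class VersionedStore:
    """Toy store (hypothetical): each key maps to (value, version counter)."""

    def __init__(self):
        self._data = {}
        # Simplification: one lock makes validate + write atomic.
        self._commit_lock = threading.Lock()

    def read(self, key):
        # Unknown keys read as (None, 0).
        return self._data.get(key, (None, 0))

    def run_transaction(self, read_keys, compute):
        # (i) Read phase: remember each value and the version it had.
        snapshot = {k: self.read(k) for k in read_keys}
        # (ii) Compute phase: arbitrary work, no locks held.
        writes = compute({k: value for k, (value, _) in snapshot.items()})
        with self._commit_lock:
            # (iii) Validate phase: abort if any read record has changed.
            for k, (_, seen_version) in snapshot.items():
                if self.read(k)[1] != seen_version:
                    return False
            # (iv) Write phase: install new values with bumped versions.
            for k, value in writes.items():
                self._data[k] = (value, self.read(k)[1] + 1)
        # (v) Commit phase: in this toy, leaving the critical section.
        return True


store = VersionedStore()
committed = store.run_transaction(["x"], lambda vals: {"x": (vals["x"] or 0) + 1})
```

A transaction that returns `False` failed validation and would typically be retried by the caller; no locks are held during the read and compute phases, which is the point of the optimistic approach.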
In a multi-version database system, new records do not physically replace old ones. Instead, a new version of the record is created, which becomes visible to other transactions at commit time. Conceptually, there may be many rows for a record, each corresponding to the state of the database at some point in the past. Older versions may be garbage-collected as the need for old data diminishes, in order to reclaim space for new data.
The concurrent access of data by different transactions poses an obstacle when relying on traditional locking, because reader and writer transactions are incompatible and block each other. Therefore, as concurrency increases with the growth in processor core counts and main-memory sizes, resource contention between readers and writers increases and the overall utilization of the system deteriorates. This effect is further magnified when, in addition to typical short update transactions, there are long-running read-only transactions that hold read locks for an extended period of time, which could essentially bring the database to a stall.
The conflict between readers and writers, especially with long-running readers, limits the prospects of single-version concurrency control. A naive (and rather common) approach is to deal with this limitation by relaxing the consistency model and settling for transaction-inconsistent answers to queries, or by relying on an existing multi-version concurrency control (MVCC) model.
By keeping old data versions, a system can enable queries about the state of the database at points in the past. The ability to query the past has a number of important applications, for example: (i) a financial firm being required to retain any changes made to client information for up to five years in accordance with auditing regulations; (ii) a retailer ensuring that it offers only one discount for each product at any given time; and (iii) a bank retroactively correcting an error in calculating the promised introductory interest rate. In addition to these business-specific scenarios, there is an inherent algorithmic benefit to retaining the old versions of records and avoiding in-place updates, namely the ability to use efficient optimistic locking and latch-free data structures.
A simple implementation of a multi-version database would store the row-identifier (RID) of the old version within the row of the new version, defining a linked list of versions. Such an implementation allows for the easy identification of old versions of each row, but puts the burden of reconstructing consistent states at particular times on the application, which would need to keep timing information within each row.
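As a rough illustration of this linked-list layout, the sketch below assumes that a RID is simply a row's position in an append-only list. The names `Row`, `VersionedTable`, `put`, and `history` are hypothetical. Note that the rows carry no timing information, which is exactly the burden left to the application.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Row:
    payload: Any
    prev_rid: Optional[int]  # RID of the previous version; None for the first


class VersionedTable:
    """Hypothetical append-only table; a RID is an index into self.rows."""

    def __init__(self):
        self.rows: List[Row] = []
        self.latest: Dict[Any, int] = {}  # key -> RID of the newest version

    def put(self, key, payload) -> int:
        # Never update in place: append a new row that points at the old one.
        rid = len(self.rows)
        self.rows.append(Row(payload, self.latest.get(key)))
        self.latest[key] = rid
        return rid

    def history(self, key) -> list:
        # Follow the prev_rid links from the newest version to the oldest.
        payloads = []
        rid = self.latest.get(key)
        while rid is not None:
            row = self.rows[rid]
            payloads.append(row.payload)
            rid = row.prev_rid
        return payloads


table = VersionedTable()
table.put("a", "v1")
table.put("a", "v2")
versions = table.history("a")  # newest first
```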
To relieve applications of such burdens, a multi-version database system can maintain explicit timing information for each row. In a valid-time temporal model, each row is associated with an interval (begin-time, end-time) during which it was or is current. Several implementation choices exist for such a model. One could store only the begin-time with each new row, and infer the end-time as the begin-time of the next version. Compared with storing both the begin-time and end-time explicitly for each row, this choice saves space and also avoids the write I/O otherwise needed to update the old version. On the other hand, queries over historical versions are more complex because they need to consult more rows to reconstruct validity intervals.
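The begin-time-only choice can be sketched as follows. This is a minimal illustration that assumes numeric timestamps, strictly increasing begin-times per record, and half-open intervals; the names `TemporalRecord`, `interval`, and `as_of` are hypothetical.

```python
import bisect
import math


class TemporalRecord:
    """Hypothetical version list for one record, storing begin-times only."""

    def __init__(self):
        self.begin_times = []  # kept sorted because versions are appended in order
        self.values = []

    def append(self, begin_time, value):
        if self.begin_times and begin_time <= self.begin_times[-1]:
            raise ValueError("begin-times must be strictly increasing")
        self.begin_times.append(begin_time)
        self.values.append(value)

    def interval(self, i):
        # The end-time is not stored: it is the next version's begin-time,
        # or infinity for the version that is still current.
        has_next = i + 1 < len(self.begin_times)
        end = self.begin_times[i + 1] if has_next else math.inf
        return (self.begin_times[i], end)

    def as_of(self, t):
        # Find the last version whose begin-time is <= t.
        i = bisect.bisect_right(self.begin_times, t) - 1
        return self.values[i] if i >= 0 else None


record = TemporalRecord()
record.append(10, "first")
record.append(20, "second")
first_interval = record.interval(0)  # (10, 20), inferred from the next row
```

Reconstructing an interval requires reading the neighboring version, which reflects the extra query cost described above; in exchange, appending a version never touches the older rows.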
There are several options for the physical organization of a multi-version database. For example, one organization option appends old versions of records to a “history” table and only keeps the most recent version in the main table, updating it in-place. Commercial systems have implemented this technique.
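A minimal sketch of this organization is given below. It assumes in-memory structures and caller-supplied timestamps; the names `HistoryOrganizedTable`, `upsert`, and `as_of` are hypothetical and do not correspond to any particular commercial system.

```python
class HistoryOrganizedTable:
    """Hypothetical layout: current rows in main, old versions in history."""

    def __init__(self):
        self.main = {}     # key -> (begin_time, value); updated in place
        self.history = []  # appended tuples: (key, begin_time, end_time, value)

    def upsert(self, key, value, now):
        old = self.main.get(key)
        if old is not None:
            old_begin, old_value = old
            # Move the superseded version to the history table and
            # close its validity interval at the time of the update.
            self.history.append((key, old_begin, now, old_value))
        self.main[key] = (now, value)

    def as_of(self, key, t):
        current = self.main.get(key)
        if current is not None and current[0] <= t:
            return current[1]
        # Fall back to a scan of the history table (linear in this sketch).
        for k, begin, end, value in self.history:
            if k == key and begin <= t < end:
                return value
        return None


prices = HistoryOrganizedTable()
prices.upsert("widget", 100, now=1)
prices.upsert("widget", 120, now=5)
old_price = prices.as_of("widget", 3)
```

Queries on current data touch only the main table, while historical queries pay for a lookup in the history table; a real system would index that table rather than scan it.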