As the requirements for On-Line Database Transaction Processing (OLTP) grow, OLTP systems must support high transaction rates, on the order of thousands of transactions per second. In applications such as OLTP, a large fraction of the requests are random accesses to data. Because a large fraction of the data resides on disk, the disk sub-system must support a correspondingly high rate of random accesses, on the order of several thousand random accesses per second.
Random disk Input/Output (I/O) performance is not improving at the same rate as other system parameters such as CPU MIPS. Therefore, applications such as OLTP, where random access to data predominates, are disk-arm bound, and the disk cost is becoming a larger fraction of the system cost. Thus, there is a need for a disk sub-system that can support a high rate of random accesses per second with a better price-performance characteristic than traditional disk systems.
In a paper by M. Rosenblum and J. Ousterhout, entitled "The Design and Implementation of a Log-Structured File System", Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles (October 1991), the basic observation is made that sequential disk I/O performance is improving because of increases in disk surface density, even though random disk I/O performance is not improving at the same rate. Rosenblum and Ousterhout therefore propose a solution that replaces random writes with sequential writes. More specifically, they propose a log-structured file system that writes updates to a sequential log. Subsequently, garbage collection and compaction are used to create large free areas on disk.
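The log-structured approach described above can be illustrated with a minimal sketch. It is a toy in-memory model, not the design from the cited paper: the class name `LogStructuredStore`, its methods, and the use of a Python list as the "disk" are all assumptions made for illustration. Each update is appended to the tail of a sequential log, a map tracks the live copy of each page, and compaction copies live pages forward to reclaim the space held by superseded copies.

```python
class LogStructuredStore:
    """Toy model (hypothetical): every page update is appended to a sequential log."""

    def __init__(self):
        self.log = []       # physical slots in write order, each (page_id, data)
        self.page_map = {}  # logical page id -> slot index of its live copy

    def write(self, page_id, data):
        # A random update becomes a sequential append at the log tail;
        # any earlier copy of the page is left behind as garbage.
        self.page_map[page_id] = len(self.log)
        self.log.append((page_id, data))

    def read(self, page_id):
        # Reads go through the map to find the current physical location.
        return self.log[self.page_map[page_id]][1]

    def compact(self):
        # Garbage collection: keep only live copies, preserving log order,
        # so the space occupied by superseded copies becomes free.
        new_log, new_map = [], {}
        for slot, (page_id, data) in enumerate(self.log):
            if self.page_map.get(page_id) == slot:
                new_map[page_id] = len(new_log)
                new_log.append((page_id, data))
        self.log, self.page_map = new_log, new_map


store = LogStructuredStore()
for page in range(4):
    store.write(page, f"v0-{page}")   # initial load: pages 0..3 in order
store.write(2, "v1-2")                # update page 2 (appended at slot 4)
store.write(0, "v1-0")                # update page 0 (appended at slot 5)
slots_before = len(store.log)         # 6 slots, 2 of them garbage
store.compact()
slots_after = len(store.log)          # 4 slots, all live
```

After compaction the live pages sit in the order 1, 3, 2, 0 rather than 0, 1, 2, 3, which is the side effect discussed next.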
A basic problem with this method is that clustering of data on the disk is lost. That is, by remapping a page (or block) of data to another location on disk, previously adjacent pages (or blocks) are moved to arbitrarily distant locations on disk. As a consequence, a later sequential reading of those formerly adjacent pages (or blocks) will require access to widely separated locations on the disk (i.e., formerly sequential read operations are converted into random read operations).
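The loss of clustering can be made concrete with a small sketch. The model is an assumption for illustration only: pages start at physical locations equal to their page numbers, each update relocates the page to the next free location at the tail of a log placed after the original area, and a "seek" is counted whenever the next page read is not at the immediately following location. The function names `remap_updates` and `count_seeks` are hypothetical.

```python
def remap_updates(num_pages, updated_pages):
    """Return {page: physical location} after appending each update to a log tail."""
    location = {p: p for p in range(num_pages)}  # initially clustered: page p at slot p
    tail = num_pages                             # log starts right after the home area
    for p in updated_pages:
        location[p] = tail                       # updated page moves to the log tail
        tail += 1
    return location


def count_seeks(physical_locations):
    """Count reads that do not land on the slot right after the previous one."""
    seeks = 0
    for prev, cur in zip(physical_locations, physical_locations[1:]):
        if cur != prev + 1:
            seeks += 1
    return seeks


before = remap_updates(8, [])
after = remap_updates(8, [5, 1, 6, 2])           # four random updates

# Logical sequential scan of pages 0..7, expressed as physical locations.
scan_before = [before[p] for p in range(8)]      # [0, 1, 2, 3, 4, 5, 6, 7]
scan_after = [after[p] for p in range(8)]        # [0, 9, 11, 3, 4, 8, 10, 7]

seeks_before = count_seeks(scan_before)          # 0
seeks_after = count_seeks(scan_after)            # 6
```

In this toy case, updating four of eight pages turns a scan with no seeks into one in which six of the seven page-to-page transitions require a seek.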
While random read accesses predominate in OLTP applications, significant sequential access also exists. For example, consider the "Transaction Processing Council Benchmark C (TPC-C), Standard Specification, Revision 1.0", Edited by Francois Raab (Aug. 13, 1992). In the TPC-C benchmark, one of the transaction types is an order transaction that creates a table of customer orders. Later, a delivery transaction processes the oldest unsatisfied orders sequentially in a batch transaction. This leads to significant sequential access to the data in addition to the predominant random access. Further, queries against a database often access data via a table-space scan, which reads the data in clustering order, i.e., in the order in which the data is stored on disk.