The RAID-5 standard describes a fault-tolerant architecture for storing data on disk storage devices. A plurality of disk drives are arranged into a storage array. Data is stored in the array in units termed stripes. Each stripe is partitioned into sub-units termed blocks, with one block of each stripe stored on each disk drive in the array. The storage array is protected against single-disk drive failures by assigning one block in each stripe to be the parity block for the stripe. RAID-5 provides excellent performance for large consecutive reads and batch loads, because each block in a stripe may be accessed in parallel with every other block. However, RAID-5 storage arrays have poor performance for the small updates typically found in transaction processing, because even a small update to a single data block requires the parity block of the stripe to be updated as well.
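The parity relationship described above can be illustrated with a minimal sketch. It assumes the usual byte-wise XOR parity and a hypothetical four-drive array holding three data blocks and one parity block per stripe; the names xor_blocks and small_write are illustrative only and do not come from any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-drive array: 3 data blocks + 1 parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(*data)

# Single-drive failure: the lost block is the XOR of the survivors and parity.
rebuilt = xor_blocks(data[0], data[2], parity)


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity for a one-block update.

    The old data and old parity must be read before the new data and new
    parity can be written, which is the cost of a small update.
    """
    return xor_blocks(old_parity, old_data, new_data)


new_block = b"\xff\x00"
new_parity = small_write(data[1], parity, new_block)
```

In this sketch a one-block update touches two drives twice each (read old data, read old parity, write new data, write new parity), whereas a full-stripe write can compute the parity from the new data alone.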
Several schemes have been proposed to overcome this performance problem. For example, the scheme proposed by Savage and Wilkes ("AFRAID--A Frequently Redundant Array of Independent Disks", by Stefan Savage and John Wilkes, 1996 USENIX Technical Conference, Jan. 22-26, 1996) provides a greatly improved level of performance for RAID-5 arrays. This scheme defers the update to the parity block to periods in which the disk drives are idle, a situation which occurs frequently. However, this scheme also increases the vulnerability of the array to single-disk drive failures, because blocks updated since the last parity update are likely to be lost if a disk drive fails before the deferred parity update takes place.
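The deferred-parity idea can be sketched as follows. This is a toy model under stated assumptions, not the implementation described by Savage and Wilkes: the class DeferredParityArray, its methods, and the in-memory set of stale stripes are all hypothetical, and XOR parity is assumed.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class DeferredParityArray:
    """Toy model: each stripe is a list of data blocks plus one parity block."""

    def __init__(self, stripes):
        self.data = [list(stripe) for stripe in stripes]
        self.parity = [xor_blocks(*stripe) for stripe in stripes]
        self.stale = set()  # stripes whose parity no longer matches the data

    def write(self, stripe: int, index: int, block: bytes) -> None:
        """Write the data block only; the parity update is deferred."""
        self.data[stripe][index] = block
        self.stale.add(stripe)

    def on_idle(self) -> None:
        """Recompute parity for every stale stripe during an idle period."""
        for stripe in sorted(self.stale):
            self.parity[stripe] = xor_blocks(*self.data[stripe])
        self.stale.clear()

    def is_protected(self, stripe: int) -> bool:
        """A stripe survives a single-drive failure only if parity is current."""
        return stripe not in self.stale


array = DeferredParityArray([[b"\x01", b"\x02", b"\x04"]])
array.write(0, 1, b"\x0f")
exposed = not array.is_protected(0)  # window of vulnerability
array.on_idle()
protected_again = array.is_protected(0)
```

Between write and on_idle the stripe's parity is stale, so a drive failure in that window cannot be repaired from parity; this is the vulnerability noted above.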
The scheme proposed by Stodolsky et al. ("Parity Logging--Overcoming the Small Write Problem in Redundant Disk Arrays", by Daniel Stodolsky, Garth Gibson and Mark Holland, IEEE 1993, pp. 64-75) generates parity updates and logs them, rather than updating the parity immediately. When the log buffer is full, the logged parity updates are all applied in one large update. This scheme preserves the reliability of the storage array, but increases performance only to the extent that the logging overhead plus the overhead of the large update is less than the overhead of updating the parity immediately after each write.
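The logging approach can be sketched in the same style. This is an illustrative model only, not the design of Stodolsky et al.: the class ParityLog, its capacity parameter, and the use of an in-memory list as the log buffer are assumptions made for the example, and XOR parity is assumed.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ParityLog:
    """Toy model: log parity updates, then apply them in one batch."""

    def __init__(self, parity: dict, capacity: int):
        self.parity = dict(parity)  # stripe number -> parity block
        self.capacity = capacity    # hypothetical log buffer size, in entries
        self.log = []               # pending (stripe, parity update) entries
        self.flushes = 0

    def record_write(self, stripe: int, old_data: bytes, new_data: bytes) -> None:
        """Log the parity update (old XOR new) instead of applying it now."""
        self.log.append((stripe, xor_blocks(old_data, new_data)))
        if len(self.log) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        """Apply every logged update to the stored parity in one pass."""
        for stripe, update in self.log:
            self.parity[stripe] = xor_blocks(self.parity[stripe], update)
        self.log.clear()
        self.flushes += 1

    def effective_parity(self, stripe: int) -> bytes:
        """Stored parity combined with any updates still in the log."""
        result = self.parity[stripe]
        for logged_stripe, update in self.log:
            if logged_stripe == stripe:
                result = xor_blocks(result, update)
        return result


# Stripe 0 holds data blocks 0x01, 0x02, 0x04, so its parity is 0x07.
plog = ParityLog({0: b"\x07"}, capacity=2)
plog.record_write(0, b"\x01", b"\x09")   # logged, not yet applied
pending = plog.effective_parity(0)
plog.record_write(0, b"\x02", b"\x03")   # buffer full: batch update
```

Because the stored parity together with the log always yields the current parity, no update is lost, which is why reliability is preserved in this model; the saving comes from replacing many small parity writes with one batched pass.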
While the increased vulnerability of the Savage--Wilkes scheme may be tolerated in some applications, it is not acceptable in other applications, such as databases. A need arises for a technique which provides improved performance over standard RAID-5 without increasing vulnerability to single-disk drive failures.