Computer systems utilize data redundancy schemes such as parity computation to protect against loss of data on a storage device. A redundancy value is computed by calculating a function of the data of a specific word size across a quantity of similar storage devices, also referenced as data drives. One example of such redundancy is exclusive OR (XOR) parity that is computed as the binary sum of the data; another common redundancy uses Reed-Solomon codes based on finite field arithmetic. A plurality of redundancy values (hereinafter referenced as parity values) are stored on a plurality of additional storage devices, also referenced as parity drives. In the case of a parity drive failure or loss of data on the parity drive, the data on the parity drive can be regenerated from data stored on the data drives. In the case of data drive failure or loss of data on the data drive, the data on the data drive can be regenerated from the data stored on the parity drives and other non-failing data drives. Data can be regenerated, for example, from the parity drives by adding the data on the remaining data drives and subtracting the result from data stored on the parity drives.
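The XOR parity computation and the regeneration of a lost data drive described above can be illustrated with a minimal sketch. The helper name `xor_blocks`, the three in-memory "drives", and the 4-byte block size are assumptions chosen for illustration only; they are not drawn from any particular array controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding one 4-byte block of a stripe.
data_drives = [
    b"\x01\x02\x03\x04",
    b"\x10\x20\x30\x40",
    b"\xaa\xbb\xcc\xdd",
]

# The parity block is the XOR (bitwise sum modulo 2) of the data blocks.
parity = xor_blocks(data_drives)

# Simulate the loss of data drive 1, then regenerate its block from the
# parity block and the surviving data blocks.
survivors = [data_drives[0], data_drives[2]]
regenerated = xor_blocks(survivors + [parity])
```

Because addition and subtraction are the same operation under XOR, the one helper serves both to compute the parity and to regenerate the missing block.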
In Redundant Array of Independent Disks (RAID) systems, data files and related parity are striped across multiple disk drives. In storage subsystems that manage multiple hard disk drives (herein referenced interchangeably as disks or drives) as a single logical direct attached or network attached storage device (DASD/NASD), the RAID logic is implemented in an array controller of the subsystem. Such RAID logic may also be implemented in software in a host system.
Disk arrays, in particular RAID-3 and RAID-5 disk arrays, have become accepted designs for highly available and reliable disk subsystems. In such arrays, the exclusive-OR of data from some number of disks is maintained on a redundant disk (the parity drive). When a disk fails, the data on it can be reconstructed by exclusive-ORing the data on the surviving disks and writing the result to a spare disk. Data is lost if a second disk fails before the reconstruction is complete.
The most common RAID systems are based on parity schemes to provide added fault tolerance. For purposes of illustration only, the invention is described with reference to a RAID-5 system, with the understanding that other parity-based disk array systems may alternatively be used.
To update a small piece of data in a RAID-5 system, the RAID array controller first reads the old data in that location, reads the corresponding old parity from the corresponding parity drive, and XORs (exclusive ORs) these data with the new data to generate the new parity, after which the RAID array controller can write the new data to the data drive and the new parity to the parity drive. In other terms, the RAID array controller needs to perform a read-modify-write of the data drive and the parity drive. Each read of the data drive or the parity drive requires movement of a disk arm to the data being read; this movement is referenced as a “seek”. In systems with two or more parity drives, for example a RAID-6 system, one seek is required for each parity drive to read parity data during the write process.
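The read-modify-write sequence described above can be sketched as follows. The `Drive` class is a hypothetical in-memory stand-in for a disk that merely counts accesses, and `small_write` is an illustrative name; neither represents an actual controller interface. Each counted read corresponds to an access that would require a seek on a physical drive.

```python
class Drive:
    """Hypothetical in-memory stand-in for a disk; counts reads and writes."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.reads = 0
        self.writes = 0

    def read(self, lba):
        self.reads += 1
        return self.blocks[lba]

    def write(self, lba, block):
        self.writes += 1
        self.blocks[lba] = block


def xor(a, b):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_drive, parity_drive, lba, new_data):
    """Update one block using read-modify-write on the data and parity drives."""
    old_data = data_drive.read(lba)        # read old data
    old_parity = parity_drive.read(lba)    # read old parity
    # new parity = old data XOR old parity XOR new data
    new_parity = xor(xor(old_data, old_parity), new_data)
    data_drive.write(lba, new_data)
    parity_drive.write(lba, new_parity)


# Two data drives and one parity drive, one block each (assumed layout).
d0 = Drive([b"\x0f\x0f"])
d1 = Drive([b"\xf0\x01"])
p = Drive([xor(d0.blocks[0], d1.blocks[0])])

small_write(d0, p, 0, b"\x55\xaa")
```

In this sketch the other data drive, `d1`, is never accessed; the cost of the update falls on the target data drive and the parity drive, each of which is read once and written once.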
Although this technology has proven to be useful, it would be desirable to present additional improvements, particularly since each seek requires a relatively significant amount of time in the overall write process. As a result, the read-modify-write operation imposes a performance penalty on the execution of the write command. What is therefore needed is a system, a computer program product, and an associated method for minimizing accesses to parity drives by an array controller performing a write command.