The extensive data storage needs of modern computer systems require large capacity mass data storage devices. A common storage device is the magnetic disk drive, a complex piece of machinery containing many parts which are susceptible to failure. A typical computer system will contain several such units. The failure of a single storage unit can be a very disruptive event for the system. Many systems are unable to operate until the defective unit is repaired or replaced, and the lost data restored.
As computer systems have become larger, faster, and more reliable, there has been a corresponding increase in the need for storage devices with greater capacity, speed, and reliability. Simply adding storage units to increase storage capacity causes a corresponding increase in the probability that at least one of the units will fail. On the other hand, increasing the size of existing units, absent any other improvements, tends to reduce speed and does nothing to improve reliability.
Recently there has been considerable interest in arrays of direct access storage devices, configured to provide some level of data redundancy. Such arrays are commonly known as "RAIDs" (Redundant Array of Inexpensive (or Independent) Disks). Various types of RAIDs providing different forms of redundancy are described in a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)", by Patterson, Gibson and Katz, presented at the ACM SIGMOD Conference, June, 1988. Patterson, et al., classify five types of RAIDs designated levels 1 through 5. The Patterson nomenclature has become standard in the industry.
The underlying theory of RAIDs is that a number of relatively inexpensive, small disk drives can provide the capacity of a single larger, expensive drive. The smaller drives will also be faster because they will all be reading or writing ("accessing") data at the same time. Finally, because the small drives cost so little, it is possible to include extra (redundant) disk drives which, in combination with certain storage management techniques, permit the system to recover the data stored on one of the small drives should it fail. Thus, RAIDs permit increased capacity, performance, and reliability.
Using the Patterson nomenclature, RAID levels 3 and higher (RAID-3, RAID-4, RAID-5) employ parity records for data redundancy. Parity records are formed from the Exclusive-OR of all data records stored at a particular location on different storage units in the array. In other words, in an array of N storage units, each bit in a block of data at a particular location on a storage unit is Exclusive-ORed with the corresponding bit at that location on each of the other units in a group of (N-1) storage units to produce a block of parity bits; the parity block is then stored at the same location on the remaining (Nth) storage unit. If any storage unit in the array fails, the data contained at any location on the failing unit can be regenerated by taking the Exclusive-OR of the data blocks at the same location on the remaining devices and their corresponding parity block.
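By way of illustration only, the parity generation and regeneration described above can be sketched in a few lines of Python. The helper name `xor_blocks`, the three-byte block size, and the sample byte values are assumptions chosen for the example and do not appear in Patterson et al.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise Exclusive-OR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical array of N = 4 units: three data blocks at one location,
# with the parity block stored on the remaining (4th) unit.
data = [b"\x0f\xa5\x3c", b"\xf0\x5a\xc3", b"\x55\x55\x55"]
parity = xor_blocks(data)

# Simulate the failure of unit 1, then regenerate its block from the
# surviving data blocks and the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Because Exclusive-OR is its own inverse, the same routine serves both to build the parity block and to recover a lost block.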
In a RAID-3, all the read/write actuators on the different disk drives act in unison to access data at the same location on each drive. RAID-4 and RAID-5 are further characterized by independently operating read/write actuators in the disk drive units. In other words, each read/write head of a disk drive unit is free to access data anywhere on the disk, without regard to where other units in the array are accessing data.
One of the problems encountered with parity protected disk arrays having independent read/writes (i.e., RAID-4 or RAID-5) is the overhead associated with updating the parity block whenever a data block is written. Typically, the data block to be written is first read and the old data Exclusive-ORed with the new data to produce a change mask. The parity block is then read and Exclusive-ORed with the change mask to produce the new parity data. The data and parity blocks can then be written. Thus, two read and two write operations are required each time data is updated.
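The update sequence just described can be sketched as follows. This is a minimal model under stated assumptions: each disk is represented as an in-memory Python list of byte blocks, and the names `small_write` and `xor_bytes` are hypothetical, introduced only for this example.

```python
def xor_bytes(a, b):
    """Byte-wise Exclusive-OR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write(disks, data_disk, parity_disk, location, new_data):
    """Update one data block and its parity block; return the I/O counts."""
    old_data = disks[data_disk][location]          # read 1: old data
    change_mask = xor_bytes(old_data, new_data)
    old_parity = disks[parity_disk][location]      # read 2: old parity
    new_parity = xor_bytes(old_parity, change_mask)
    disks[data_disk][location] = new_data          # write 1: new data
    disks[parity_disk][location] = new_parity      # write 2: new parity
    return {"reads": 2, "writes": 2}

# Hypothetical 4-unit array with a single location; unit 3 holds parity.
disks = [[b"\x01\x02"], [b"\x10\x20"], [b"\x0f\x0f"], [b"\x1e\x2d"]]
io = small_write(disks, data_disk=1, parity_disk=3, location=0,
                 new_data=b"\xff\x00")
```

Note that the other data units are never touched: the change mask alone is sufficient to bring the parity block up to date.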
In both RAID-3 and RAID-4, the parity records are stored on a single disk unit. U.S. Pat. No. 4,761,785 to Clark et al., which is hereby incorporated by reference, describes a type of independent read/write array in which the parity blocks are distributed substantially equally among the disk storage units in the array. Distributing the parity blocks shares the burden of updating parity among the disks in the array on a more or less equal basis, thus avoiding potential performance bottlenecks that may arise when all parity records are maintained on a single dedicated disk drive unit. Patterson et al. have designated the Clark array RAID-5. RAID-5 is the most advanced level RAID described by Patterson, offering improved performance over other parity protected RAIDs.
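The difference between a dedicated parity unit and distributed parity can be illustrated with a short sketch. The rotation rule used here (parity for stripe s on unit s mod N) is an assumption made for the example; it is one simple way to distribute parity and is not asserted to be the mapping used by Clark et al. The function names are likewise hypothetical.

```python
def dedicated_parity_unit(stripe, num_units):
    """RAID-4 style: every parity block lives on the last unit."""
    return num_units - 1

def rotated_parity_unit(stripe, num_units):
    """Assumed rotation: parity for stripe s lives on unit s mod N."""
    return stripe % num_units

def parity_updates(placement, num_units, stripes):
    """Count parity updates per unit for one write to each given stripe."""
    counts = [0] * num_units
    for stripe in stripes:
        counts[placement(stripe, num_units)] += 1
    return counts

# One write to each of 1000 stripes on a hypothetical 4-unit array.
counts_dedicated = parity_updates(dedicated_parity_unit, 4, range(1000))
counts_distributed = parity_updates(rotated_parity_unit, 4, range(1000))
```

Under this uniform workload the dedicated arrangement places all 1000 parity updates on one unit, whereas the rotated arrangement places 250 on each.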
In theory, a RAID-5 as described in Clark equalizes the burden of parity updating among all the disk drives. However, in real systems it is very difficult to achieve an absolutely equal distribution of this burden. Data storage accesses tend to exhibit a phenomenon known as "locality". That is, some of the data stored in a disk array will be accessed much more frequently than other data. The more frequently accessed data tends to be clustered together. As a result, the disk unit having the parity block for the most frequently accessed data will carry more than its "fair share" of the parity updating burden.
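The effect of locality on the parity updating burden can be illustrated with a small, deliberately artificial example. The rotation rule, the stripe numbers, and the 90/10 split of the workload are all assumptions chosen for the sketch and are not measurements of any real system.

```python
from collections import Counter

NUM_UNITS = 4

def parity_unit(stripe, num_units=NUM_UNITS):
    # Assumed rotation, not taken from Clark: parity for stripe s
    # lives on unit s mod N.
    return stripe % num_units

# Hypothetical workload: 100 writes spread evenly over stripes 0..99,
# plus 900 writes to a single frequently accessed stripe (stripe 5).
writes = list(range(100)) + [5] * 900

updates = Counter(parity_unit(s) for s in writes)
load = [updates[u] for u in range(NUM_UNITS)]
```

In this example unit 1, which holds the parity block for the frequently accessed stripe, performs 925 of the 1000 parity updates, while each of the other units performs 25.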