A typical data storage system includes one or more arrays of magnetic disk drives or similar non-volatile storage devices, and a controller that controls the manner and locations in which data is written to and read from the devices. It is important that a host system be able to reliably access all of the data in the data storage system. However, a potential problem that affects data storage systems is that one or more of the devices can fail or malfunction in a manner that prevents the host system from accessing some or all of the data stored on the affected device or devices.
A redundant array of inexpensive (or independent) disks (RAID) is a common type of data storage system that addresses the above-referenced reliability problem by enabling recovery from the failure of one or more storage devices. For example, in the system illustrated in FIG. 1, a RAID controller 10 controls a storage array 12 in a manner that enables such recovery. A host (computer) system 14 stores data in and retrieves data from storage array 12 via RAID controller 10. That is, a processor 16, operating in accordance with an application program 18, issues requests for writing data to and reading data from storage array 12. Although for purposes of clarity host system 14 and RAID controller 10 are depicted in FIG. 1 as separate elements, it is common for a RAID controller 10 to be physically embodied as a card that plugs into a motherboard or backplane of such a host system 14.
It is known to incorporate data caching in a RAID system. In the system illustrated in FIG. 1, RAID controller 10 includes a RAID processing system 20 that caches data in units of blocks, which can be referred to as read cache blocks (RCBs) and write cache blocks (WCBs). The WCBs comprise data that host system 14 sends to RAID controller 10 as part of requests to store the data in storage array 12. In response to such a write request from host system 14, RAID controller 10 caches or temporarily stores a WCB in one or more cache memory modules 22, then returns an acknowledgement message to host system 14. At some later point in time, RAID controller 10 transfers the cached WCB (typically along with other previously cached WCBs) to storage array 12. The RCBs comprise data that RAID controller 10 has frequently read from storage array 12 in response to read requests from host system 14. Caching frequently requested data is more efficient than reading it from storage array 12 each time host system 14 requests it, since cache memory modules 22 are of a type of memory, such as flash memory, that can be accessed much faster than the type of storage (e.g., disk drives) that storage array 12 comprises.
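The write path described above (cache the WCB, acknowledge the host, transfer to the storage array later) can be illustrated with a minimal sketch. The class name `WriteBackCache`, its methods, and the `"ACK"` return value are hypothetical names chosen for illustration, and a plain dictionary stands in for storage array 12; none of this is taken from RAID controller 10 itself.

```python
class WriteBackCache:
    """Toy model of write caching: acknowledge first, write to the array later."""

    def __init__(self, storage_array):
        # Assumption: a dict keyed by block address stands in for the storage array.
        self.storage_array = storage_array
        # Cached write blocks that have not yet been transferred to the array.
        self.pending = {}

    def write(self, block_address, data):
        # Cache the block and acknowledge immediately, before the array is touched.
        self.pending[block_address] = data
        return "ACK"

    def flush(self):
        # At some later point, transfer all cached write blocks to the array together.
        flushed = len(self.pending)
        self.storage_array.update(self.pending)
        self.pending.clear()
        return flushed


array = {}
cache = WriteBackCache(array)
ack = cache.write(0, b"A")            # acknowledged while the array is still empty
array_was_empty_after_ack = (len(array) == 0)
cache.write(1, b"B")
flushed = cache.flush()               # both cached blocks reach the array together
```

In this sketch the host sees the acknowledgement before any data reaches the array, which is the property that makes write caching fast and also the reason cached blocks must not be lost before they are flushed.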
Various RAID schemes are known. The various RAID schemes are commonly referred to by a “level” number, such as “RAID-0,” “RAID-1,” “RAID-2,” etc. As illustrated in FIG. 1, storage array 12 in a conventional RAID-5 system can include, for example, four storage devices 24, 26, 28 and 30 (e.g., arrays of disk drives). In accordance with the RAID-5 scheme, data blocks, which can be either RCBs or WCBs, are distributed across storage devices 24, 26, 28 and 30. Distributing logically sequential data blocks across multiple storage devices is known as striping. Parity information for the data blocks distributed among storage devices 24, 26, 28 and 30 in the form of a stripe is stored along with that data as part of the same stripe. For example, RAID controller 10 can distribute or stripe logically sequential data blocks A, B and C across corresponding storage areas in storage devices 24, 26 and 28, respectively, and then compute parity information for data blocks A, B and C and store the resulting parity information P_ABC in another corresponding storage area in storage device 30.
A processor 32 in RAID processing system 20 is responsible for computing the parity information. RAID processing system 20 includes some amount of fast local memory 34, such as double data rate synchronous dynamic random access memory (DDR SDRAM), that processor 32 utilizes in the parity computation. To compute the parity in the foregoing example, processor 32 reads data blocks A, B and C from storage devices 24, 26 and 28, respectively, into local memory 34 and then performs an exclusive disjunction operation, commonly referred to as an Exclusive-Or (XOR), on data blocks A, B and C in local memory 34. Processor 32 then stores the computed parity P_ABC in storage device 30 in the same stripe in which data blocks A, B and C are stored in storage devices 24, 26 and 28, respectively. The above-described movement of cached data and computed parity information is indicated in a general manner in broken line in FIG. 1.
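The XOR parity computation can be illustrated with a short sketch. The function name `xor_parity` and the three-byte sample blocks are assumptions made for illustration only; actual data blocks are far larger. The sketch also shows why XOR parity enables recovery: XORing the surviving blocks of a stripe with the parity reproduces the missing block.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical sample contents for data blocks A, B and C.
block_a = bytes([0x0F, 0xAA, 0x01])
block_b = bytes([0xF0, 0x55, 0x02])
block_c = bytes([0xFF, 0xFF, 0x04])

# Parity for the stripe holding A, B and C.
p_abc = xor_parity([block_a, block_b, block_c])

# If the device holding B fails, B is rebuilt from the surviving blocks and parity.
recovered_b = xor_parity([block_a, block_c, p_abc])
```

Because XOR is its own inverse, the same function serves both to compute parity and to reconstruct any single lost block of the stripe.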
The RAID-5 scheme employs parity rotation, which means that RAID controller 10 does not store the parity information for every stripe on the same one of storage devices 24, 26, 28 and 30; rather, the storage device that holds the parity information varies from stripe to stripe. For example, as shown in FIG. 1, parity information P_DEF for data blocks D, E and F is stored on storage device 28, while data blocks D, E and F are stored in the same stripe as parity information P_DEF but on storage devices 24, 26 and 30, respectively. Similarly, parity information P_GHJ for data blocks G, H and J is stored on storage device 26, while data blocks G, H and J are stored in the same stripe as parity information P_GHJ but on storage devices 24, 28 and 30, respectively. Likewise, parity information P_KLM for data blocks K, L and M is stored on storage device 24, while data blocks K, L and M are stored in the same stripe as parity information P_KLM but on storage devices 26, 28 and 30, respectively.
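The rotation pattern in this example can be expressed as a simple index calculation. The following sketch is an illustration only: the function name `stripe_layout` and the formula for the parity position are assumptions chosen to reproduce the placement described for FIG. 1, not a statement of how RAID controller 10 computes it. Devices are identified by their reference numerals.

```python
DEVICES = (24, 26, 28, 30)  # reference numerals of the four storage devices


def stripe_layout(stripe_index, data_blocks, devices=DEVICES):
    """Map each device to the block it holds for the given stripe."""
    n = len(devices)
    if len(data_blocks) != n - 1:
        raise ValueError("each stripe holds one data block per non-parity device")
    # Assumed rotation: parity starts on the last device and moves one device
    # toward the first with each successive stripe, wrapping around.
    parity_slot = (n - 1 - stripe_index) % n
    remaining = iter(data_blocks)
    layout = {}
    for slot, device in enumerate(devices):
        if slot == parity_slot:
            layout[device] = "P_" + "".join(data_blocks)
        else:
            layout[device] = next(remaining)
    return layout


stripes = ["ABC", "DEF", "GHJ", "KLM"]
layouts = [stripe_layout(i, list(blocks)) for i, blocks in enumerate(stripes)]
for layout in layouts:
    print(layout)
```

Rotating the parity position in this way spreads parity writes across all four devices rather than concentrating them on one.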