Computer systems generally employ data storage devices, such as disk drives or solid-state storage devices, for storage and retrieval of large amounts of data. The data storage devices are usually arranged in an array. The most common type of storage device array is the RAID (Redundant Array of Inexpensive (Independent) Drives). Arrays of solid-state storage devices, such as flash memory, phase change memory, memristors, or other non-volatile storage units, can also be used in data storage systems operating in accordance with RAID principles.
The RAID uses several inexpensive drives, with a total cost less than that of a single high-performance drive, to obtain similar performance with greater security. RAIDs use a combination of mirroring and/or parity to provide greater protection against data loss. For example, in some modifications of the RAID system, data is interleaved in stripe units distributed, together with parity information, across all of the disk drives. The parity scheme in the RAID utilizes either a two-dimensional XOR algorithm or a Reed-Solomon Code in a P+Q redundancy scheme.
The main concept of the RAID is the ability to virtualize multiple drives (or other storage devices) into a single drive representation. A number of RAID schemes have evolved, each designed on the principles of aggregated storage space and data redundancy. Five standard RAID levels were originally conceived, but many more variations have evolved. The most commonly used RAID levels include:
RAID 0 provides block-level striping without parity or mirroring, and therefore has no redundancy.
In RAID 1, which uses mirroring without parity or striping, data is written identically to two drives, thereby producing a mirrored set. A “read” request is serviced by either of the two drives containing the requested data, and a “write” request writes the data to both drives.
In RAID 10, which uses mirroring and striping, data is written in stripes across the primary disks and then mirrored to the secondary disks.
In RAID 2 level, which is based on bit level striping with dedicated Hamming-code parity, rotations of all disks are synchronized, and data is striped such that each sequential bit is on a different drive.
In RAID 3 level, which is based on byte-level striping with dedicated parity, disk rotations are synchronized, and data is striped so that each sequential byte is on a different drive. Parity is calculated across corresponding bytes and stored on a dedicated parity drive.
RAID 4 employs block-level data striping with dedicated parity. RAID 4 data distribution across drives is similar to RAID 3, but the granularity of the distributed data in RAID 4 (block-level) is coarser than that employed by RAID 3 (byte-level). In this setup, files may be distributed between multiple drives. Each drive operates independently, allowing I/O requests to be performed in parallel.
RAID 5 uses block-level striping with parity distributed across the drives along with the data, and requires all drives but one to be present to operate. The array in this arrangement is not destroyed by a single drive failure. Upon a drive failure, any subsequent reads can be calculated from the distributed parity, so that the drive failure is masked from the end user.
RAID 6 level uses block-level striping with double distributed parity, and tolerates up to two concurrent drive failures.
An example RAID-5 storage device is illustrated in FIG. 1. In this example device, user data is aggregated into four stripes, each consisting of several blocks (blocks A1, B1, C1, D1, A2, . . . ). Each stripe also includes a parity block (blocks Ap, Bp, Cp, and Dp) generated from the user data by a parity generation algorithm (such as an XOR scheme). Stripes are spread across all hard drives so that each drive absorbs one block from each stripe (either a parity block or a data block). In this example, the parity block placement is shifted between stripes so that the parity data is distributed among all the drives. The block that each drive holds for a stripe can vary in size, for example, from 4 KB to 256 KB. The data block size can be modified during device setup or configuration to adjust performance.
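The shifted parity placement can be sketched as follows. This is a minimal illustration, not the layout of FIG. 1 itself: the function name `raid5_layout` is hypothetical, and the rotation rule (parity moves one drive to the left with each successive stripe) and the numbering of the data blocks are assumptions chosen so that the first stripe places A1, A2, A3, and Ap on Disk 0 through Disk 3.

```python
def raid5_layout(stripe_names, num_disks):
    """Return one row per stripe listing the block label held by each disk.

    Assumption: the parity block starts on the last disk and moves one
    disk to the left for each successive stripe.
    """
    rows = []
    for s, name in enumerate(stripe_names):
        parity_disk = (num_disks - 1 - s) % num_disks  # assumed rotation rule
        row = []
        data_index = 1
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"{name}p")            # parity block of this stripe
            else:
                row.append(f"{name}{data_index}")  # next data block
                data_index += 1
        rows.append(row)
    return rows


layout = raid5_layout("ABCD", 4)
for row in layout:
    print("  ".join(row))
```

Under these assumptions every disk ends up holding exactly one parity block, which is the property the shifted placement is meant to achieve.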
Parity data allows RAID storage devices to reconstruct lost or corrupted data. In the example illustrated in FIG. 1, the RAID can recover from a failure of Disk 1 by using the accessible user and parity data on Disk 0, Disk 2, and Disk 3 to reconstruct the lost data on Disk 1. For example, the RAID device uses data in blocks A1, A3, and Ap to reconstruct lost block A2.
Parity blocks are usually computed using the Exclusive OR (XOR) on binary blocks of data. An XOR comparison takes two binary bits, each either “0” or “1”, compares them, and outputs an XOR result of “0” or “1”. The XOR engine returns a “1” only if the two inputs are different. If both bits are the same, i.e., both “0”s or both “1”s, the output of the XOR engine is “0”.
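The reconstruction of block A2 can be illustrated with a short sketch, assuming XOR parity. The helper `xor_blocks` and the byte values of the blocks are hypothetical and serve only as an example. Because XOR is its own inverse, XOR-ing the surviving blocks of a stripe with the parity block yields the missing block.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of one or more equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and of equal size")
    # zip(*blocks) yields, for each byte position, the bytes of all blocks.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical contents of the data blocks of stripe A.
a1 = b"\x11\x22\x33\x44"
a2 = b"\xa0\xb1\xc2\xd3"
a3 = b"\x0f\x1e\x2d\x3c"

# Parity block generated when the stripe is written.
ap = xor_blocks(a1, a2, a3)

# Disk 1 fails: A2 is rebuilt from the surviving blocks A1, A3, and Ap.
rebuilt_a2 = xor_blocks(a1, a3, ap)
print(rebuilt_a2 == a2)  # True
```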
For example, as shown in Table 1:
for stripe 1, the XOR parity block may be placed in Drive 4,
for stripe 2, the XOR parity block may be placed in Drive 3,
for stripe 3, the XOR parity block may be placed in Drive 2, and
for stripe 4, the XOR parity block may be placed in Drive 1.
TABLE 1
            Drive 1   Drive 2   Drive 3   Drive 4
Stripe 1    0100      0101      0010      0011
Stripe 2    0010      0000      0110      0100
Stripe 3    0011      0001      1010      1000
Stripe 4    0110      0001      1101      1010
The parity blocks are computed by running the XOR comparison on each block of data in the stripe. That is, the first two blocks are XOR-ed, the result is then XOR-ed against the third block, and the XOR comparison continues for all drives in the array, except for the drive holding the block where the parity is stored.
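The chained computation can be checked against the values of Table 1 with a brief sketch. The 4-bit value per drive and the parity drive of each stripe are taken from Table 1 and the list preceding it; the names `table`, `parity_drive`, and `chained_parity` are illustrative only.

```python
from functools import reduce
from operator import xor

# Table 1: one 4-bit block per drive (Drive 1 .. Drive 4) for each stripe.
table = {
    1: [0b0100, 0b0101, 0b0010, 0b0011],
    2: [0b0010, 0b0000, 0b0110, 0b0100],
    3: [0b0011, 0b0001, 0b1010, 0b1000],
    4: [0b0110, 0b0001, 0b1101, 0b1010],
}

# Drive (1-based) that holds the parity block of each stripe.
parity_drive = {1: 4, 2: 3, 3: 2, 4: 1}


def chained_parity(blocks):
    """XOR the first two blocks, then XOR the result with each further block."""
    return reduce(xor, blocks)


results = {}
for stripe, row in table.items():
    p = parity_drive[stripe] - 1      # 0-based index of the parity block
    data = row[:p] + row[p + 1:]      # every block except the parity block
    results[stripe] = (chained_parity(data), row[p])
    computed, stored = results[stripe]
    print(f"Stripe {stripe}: computed {computed:04b}, stored {stored:04b}")
```

For each of the four stripes, the computed parity equals the value stored on the parity drive in Table 1.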
Current RAID engines generally use a CPU with a DMA (Direct Memory Access) capability attached to a large memory to perform XOR operations to generate parity. Typically, data to be striped across a set of drives is first written into the memory buffer of the CPU. The CPU then reads the data back in chunks (blocks) and calculates the XOR of the data to generate parity. The parity XOR data is then written back to the memory, and subsequently is “flashed” to the disks. This method requires all of the data to be buffered in the memory of the CPU.
Referring to FIG. 2 representing a typical RAID engine using a CPU, a host 10 sends a “write” data request to storage devices 12. The data is first written to a memory 14 attached to the CPU 16. In this arrangement, the data is sent to a PCIe switch 18 which forwards it to the CPU 16 which, in turn, passes the data into the memory 14. A memory controller 20 within the CPU 16 controls data writing to and reading from the memory 14.
The CPU 16 reads the data from the memory 14, performs an XOR of the data, and then writes the computed parity back into the memory 14. The CPU 16 then instructs the storage devices 12 to read the data and parity from the memory 14, and store the data.
In this arrangement, all of the data is buffered in the memory 14, which demands a very high data transfer rate at the memory interface. Because each block of data is written into the memory 14, read back by the CPU 16 for the XOR operation, and read again by the storage devices 12, in addition to the parity traffic, this scheme requires the memory interface to the CPU to be more than 3× faster than the transfer rate of the data to the array.
In addition, the reliance of the XOR operation in this arrangement on an expensive CPU and/or GPU, as well as the need for additional software to be written for the CPU and GPU operation, results in a complex and expensive scheme, which also has a large footprint and elevated cooling and power consumption needs.
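One way to arrive at a figure above 3× is to count the passes that the data makes through the memory 14 in the flow described for FIG. 2. The accounting below is an illustrative assumption, not a measurement: it supposes one parity block for every n data blocks, and the function name `memory_traffic_multiplier` is hypothetical.

```python
from fractions import Fraction


def memory_traffic_multiplier(data_blocks_per_stripe: int) -> Fraction:
    """Memory traffic per unit of host data, under the assumed accounting."""
    n = data_blocks_per_stripe
    host_write = Fraction(1)        # host data is written into the memory
    cpu_read = Fraction(1)          # CPU reads the data back to compute the XOR
    parity_write = Fraction(1, n)   # CPU writes one parity block per n data blocks
    device_read = 1 + Fraction(1, n)  # storage devices read the data and the parity
    return host_write + cpu_read + parity_write + device_read


for n in (3, 8):
    m = memory_traffic_multiplier(n)
    print(f"{n} data blocks per stripe: {m} = {float(m):.2f}x the host data rate")
```

Under this accounting the multiplier is 3 + 2/n, which exceeds 3 for any stripe width and approaches 3 as the number of data blocks per stripe grows.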
It is therefore desirable to provide XOR parity data generation in an efficient, inexpensive, and simple manner.