Computer systems generally employ data storage devices, such as disk drive devices or solid-state storage devices, for storage and retrieval of large amounts of data. Arrays of solid-state storage devices, such as flash memory, phase change memory, memristors, or other non-volatile storage units, may also be used in data storage systems.
The most common type of storage device array is the RAID (Redundant Array of Inexpensive (Independent) Drives). The main concept of RAID is the ability to virtualize multiple drives (or other storage devices) into a single drive representation. A number of RAID schemes have evolved, each designed on the principles of aggregated storage space and data redundancy.
Many of the RAID schemes employ an error protection scheme commonly referred to as “parity”, a method widely used in information technology to provide fault tolerance in a given set of data. For example, in the RAID-5 data structure, data is striped across the hard drives, with a dedicated parity block for each stripe.
In the RAID-6 scheme, the block-level striping is performed with double distributed parity. This scheme tolerates up to two concurrent drive failures.
The parity blocks are computed by running the XOR operation across the blocks of data in the stripe. The parity provides the data fault tolerance. In operation, if one disk fails (or up to two disks at the RAID-6 level), new drives can be put in place and the RAID controller can rebuild the data automatically using the parity data.
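As an illustration of the XOR parity described above, the following minimal Python sketch computes a single parity block for one stripe and then rebuilds a lost block from the surviving blocks and the parity. The function names `xor_blocks` and `rebuild_block` and the sample block contents are hypothetical and chosen only for this example. The sketch covers the single-parity case only; the second parity of a RAID-6 scheme is typically computed with a different code and is not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks in a stripe must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving_blocks, parity):
    """Recover the single missing data block of a stripe.

    XOR-ing the parity with every surviving data block cancels their
    contributions and leaves the missing block.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe of three data blocks (one per drive).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(stripe)

# Simulate the loss of the second drive and rebuild its block.
recovered = rebuild_block([stripe[0], stripe[2]], parity)
assert recovered == stripe[1]
```

Because XOR is its own inverse, the same routine serves both to generate the parity and to rebuild a missing block.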
As hard disk sizes continually increase, the need for increased redundancy becomes more pressing. The CPU performance bottleneck has been a limiting factor in any move to increase RAID redundancy beyond RAID-6.
Current RAID engines generally use a CPU (or GPU) with a DMA (Direct Memory Access) capability attached to a large memory to perform XOR operations to generate parity. Typically, data to be striped across a set of drives is first written into the memory buffer of the CPU. The CPU then reads the data back in chunks (blocks) and calculates the XOR of the data to generate parity.
The parity XOR data is then written back to the memory, and subsequently is “flashed” to the storage disks. This method requires all of the data to be buffered in the memory of the CPU. The conventional centralized CPU scheme may therefore experience a bottleneck in data migration through the data storage system.
Referring to FIG. 1 which represents a typical RAID engine using a centralized CPU for computational operations, when a host 10 sends a “write” data request to storage devices 12, the data is first written to a memory 14 attached to the CPU 16. In this arrangement, the data is sent to a PCIe switch 18 that forwards it to the CPU 16 which in turn passes the data into the memory 14. A memory controller 20 within the CPU 16 controls data writing to and reading from the memory 14.
The CPU 16 reads the data from the memory 14, performs an XOR of the data, and then writes the resulting parity back into the memory 14. The CPU 16 then instructs the storage devices 12 to read the data and parity from the memory 14 and save them internally.
In this arrangement, all of the data is buffered in the memory 14, which requires a very high data transfer rate on the Memory Interface. Each block of data crosses the Memory Interface three times (written by the host, read by the CPU, and read by the storage devices), and each parity block crosses it twice (written by the CPU and read by the storage devices). The Memory Interface to the CPU must therefore be 3× (+2× for parity) faster than the rate at which data is transferred to the array.
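The bandwidth figure above can be illustrated with a short arithmetic sketch. It assumes one reading of the "3× (+2× for parity)" requirement, based on the data path of FIG. 1: host data crosses the Memory Interface three times (host write, CPU read, storage read) and parity crosses it twice (CPU write, storage read). The function names and the 8+2 drive configuration are hypothetical and used only for illustration.

```python
def memory_interface_traffic(data_bytes, parity_bytes):
    """Total bytes crossing the memory interface for one write request.

    Assumed pass counts (an interpretation of the text, not a measured value):
      data:   host write + CPU read + storage read = 3 passes
      parity: CPU write + storage read             = 2 passes
    """
    data_passes = 3
    parity_passes = 2
    return data_passes * data_bytes + parity_passes * parity_bytes


def required_speedup(data_drives, parity_drives):
    """Memory interface bandwidth as a multiple of the host data rate.

    Assumes equally sized blocks, so each stripe carries parity_drives
    parity blocks for every data_drives data blocks.
    """
    return 3 + 2 * parity_drives / data_drives


# Hypothetical RAID-6 style stripe: 8 data drives and 2 parity drives.
speedup = required_speedup(8, 2)
assert speedup == 3.5
```

Under these assumptions, adding further parity drives raises the multiplier beyond 3×, which is consistent with the scaling concern raised for redundancy levels beyond RAID-6.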
In addition, the reliance of the XOR (or any other requested Boolean logical or arithmetic) operation on an expensive CPU (and/or GPU) in this arrangement, as well as the need for additional software to be written for the CPU (and GPU) operation, results in a complex and expensive scheme, which also has a large footprint and elevated cooling and power requirements.
It is therefore desirable to provide a data storage system which performs computations in an efficient, inexpensive, and simple manner without relying on the CPU (or GPU) for the compute operation or for buffering data. Such a system would move the RAID computations out of the CPU (or GPU), by-passing the central performance bottleneck and eliminating the need to read copies of the data into the CPU in order to change the data, which typically requires multiple “hops” to and from the memory. In addition, a design is needed which can be scaled beyond RAID-6, for example to triple redundancy and beyond, at only a minor cost increase and without sacrificing throughput.