The integrity of data stored in storage systems is typically protected by error detection code methods. One such method is the Reed-Solomon Cyclic Redundancy Check (R-S CRC). This method generates a code word for each data block stored in the storage system. The first k bits of an n-bit code word represent the data block and the last n-k bits are the CRC bits. The CRC bits are created from a modulo-2 function of the k data block bits and a generator polynomial.
When the data is retrieved from the storage system, the data is checked for errors. The check can be performed by executing the modulo-2 function of the k data block bits and the generator polynomial again. The resulting CRC bits are then compared to the previously stored CRC bits. If they are not equal, an error has occurred in either the data or its transmission. Appropriate error correction techniques are then implemented.
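The generate-and-compare procedure described above can be sketched as follows. This is a minimal illustration only, assuming a plain binary CRC computed by modulo-2 long division rather than the full R-S CRC; the function names and the small generator polynomial x^3 + x + 1 are hypothetical choices made for the example and are not taken from any particular storage system.

```python
def mod2_remainder(bits, generator):
    """Remainder of modulo-2 (XOR) long division of bits by generator.

    Both arguments are lists of 0/1 values, most significant bit first.
    The generator is assumed to start with a 1 and have at least two terms.
    """
    work = list(bits)
    for i in range(len(work) - len(generator) + 1):
        if work[i] == 1:
            # Subtract (XOR) the generator aligned at position i.
            for j, g in enumerate(generator):
                work[i + j] ^= g
    return work[-(len(generator) - 1):]


def make_code_word(data_bits, generator):
    """Return the n-bit code word: k data bits followed by n-k CRC bits."""
    padded = list(data_bits) + [0] * (len(generator) - 1)
    return list(data_bits) + mod2_remainder(padded, generator)


def check_code_word(code_word, generator):
    """Recompute the CRC bits from the data bits and compare with the stored ones."""
    k = len(code_word) - (len(generator) - 1)
    data_bits, stored_crc = code_word[:k], code_word[k:]
    return make_code_word(data_bits, generator)[k:] == stored_crc


# Hypothetical generator x^3 + x + 1 and an arbitrary 14-bit data block.
GENERATOR = [1, 0, 1, 1]
data = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]

code_word = make_code_word(data, GENERATOR)
stored_ok = check_code_word(code_word, GENERATOR)

corrupted = list(code_word)
corrupted[5] ^= 1  # flip one data bit to simulate a storage error
corrupted_ok = check_code_word(corrupted, GENERATOR)
```

Here `stored_ok` is true for the unmodified code word and `corrupted_ok` is false once a bit has been flipped, which is the mismatch that would trigger error correction.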
Redundant arrays of inexpensive disks (RAID) are an example of a data storage system. RAIDs are capable of correcting or recreating erroneous data and of remedying a complete failure of a disk in an array. Of the five RAID architectures 1-5, RAID 5 is presently the most popular since it provides high data reliability with a low overhead cost for redundancy, good data read performance and satisfactory data write performance.
RAID 5 utilizes a known error detection concept based on an exclusive-OR function and distributes the calculated parity bits, as well as the data, among all the disks in the array. The error detection is performed by an array controller, which also oversees other operations of the disk arrays. The array controller typically incorporates multiple small computer system interface (SCSI) buses.
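As a rough illustration of the exclusive-OR concept, the sketch below computes parity over one stripe and uses it to recreate the contents of a failed disk. The block contents, the three-data-disk stripe, and the helper name `xor_blocks` are assumptions made for the example; the rotation of parity among the disks is not modeled.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise exclusive-OR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One hypothetical stripe: a two-byte block on each of three data disks.
stripe = [b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"]
parity = xor_blocks(*stripe)

# If the second disk fails, its block is the XOR of the survivors and the parity.
recovered = xor_blocks(stripe[0], stripe[2], parity)
```

Because XOR is its own inverse, any single missing block of a stripe can be rebuilt this way from the remaining blocks.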
The array controller provides data to and receives data from the disks of the array through associated disk controllers. Each of these disk controllers can have a high bandwidth, such as 160 Mbytes/s. However, the bandwidth between the disk medium and the disk controller can be much less, such as 20 Mbytes/s. As a result, the disk controller has a considerable amount of bandwidth remaining that can be employed for other functions.
Transferring the error detection operation from the array controller to each disk controller has been proposed to take advantage of the remaining disk controller bandwidth. This operation transfer allows the use of a standard SCSI controller as an array controller for the RAID 5 architecture. The transfer would also reduce the amount of hardware in the array controller required to support the RAID 5 architecture and would free up bandwidth of the array controller. Moreover, the total overhead of the error detection operation for each disk controller would be less than the overhead for that operation in the array controller.
To transfer the error detection operation from the array controller to each disk controller, a Read-Modify-Write operation must be supported by each disk controller. The Read-Modify-Write operation is used any time a write operation needs to be performed on the array, such as the writing of data from a host.
There are two architectures that an array controller must support: a multiple interface striping configuration and a single interface striping configuration. Briefly, striping is the storing of data constituents, anywhere from a bit to a disk sector, onto separate disk drives of the array at related addresses. A multiple interface configuration has the disk drives of the array on separate buses or loops. A single interface configuration has the disk drives on the same bus or loop. A disk drive of an array can communicate directly with another disk drive in the single interface configuration, but must communicate with the other disk drive through the array controller in a multiple interface configuration.
For both of these architectures, the disk controller of each disk drive should have the capability to store new data, perform a logic function (such as an exclusive-OR (XOR) function) on that new data with old data from the logical address of the new data, and output the result of that XOR function for use by a parity drive. The disk controller should also have the capability to receive from another drive in the array parity bits from an XOR function of the old and new data, XOR that result with the old parity bits corresponding to the old data stored on the other drive, and store the resulting new parity bits to the address of the old parity bits. Further detail is available in "RAID 5 Support on SCSI Disk Drives", Rev. 1.5, Seagate Technology, which is hereby incorporated by reference.
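The two capabilities described above amount to a Read-Modify-Write sequence split between a data drive and a parity drive. The following is a minimal sketch under stated assumptions: each disk medium is modeled as a Python dictionary keyed by address, and the function names `data_drive_write` and `parity_drive_update` are hypothetical and do not correspond to actual SCSI commands.

```python
def xor_bytes(a, b):
    """Byte-wise exclusive-OR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def data_drive_write(medium, address, new_data):
    """Data drive: XOR new data with old data, store the new data, return the result."""
    old_data = medium[address]
    result = xor_bytes(old_data, new_data)
    medium[address] = new_data
    return result


def parity_drive_update(medium, address, xor_result):
    """Parity drive: XOR the received result with the old parity and store it in place."""
    old_parity = medium[address]
    medium[address] = xor_bytes(old_parity, xor_result)


# Hypothetical array: three data drives and one parity drive, one block at address 7.
drives = [{7: b"\x01\x02"}, {7: b"\x10\x20"}, {7: b"\xaa\x55"}]
parity_drive = {7: xor_bytes(xor_bytes(drives[0][7], drives[1][7]), drives[2][7])}

# The host writes new data to the second data drive.
result = data_drive_write(drives[1], 7, b"\xff\x00")
parity_drive_update(parity_drive, 7, result)
```

After the update, the parity block again equals the XOR of the current data blocks, without the other data drives having been read.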
An example of these capabilities is shown in FIG. 1, which illustrates a multiple interface striping configuration with a distributed XOR function. An array controller 10 receives new data from a host (not shown) via a lead 12 and stores the new data in a new data buffer 14. New data from new data buffer 14 is supplied to a port 16 via a lead 18. From port 16, the new data is supplied to distributed XOR device 20 and disk medium 22 of data disk drive 24 via lead 26. XOR device 20 performs an exclusive-OR function on the new data and the old data supplied from disk medium 22 via lead 21. The old data is retrieved from the old data address on disk medium 22. A CRC check of the data is also performed. The result from XOR device 20 is supplied via lead 27 to buffer 28. The host then reads the XOR result from buffer 28 to port 16 over a lead 30.
Port 16 provides the result to XOR buffer 32 via lead 34. The result is then output from port 36 via lead 38 to parity drive 40 via lead 42. Distributed XOR device 44 of parity drive 40 receives the result and performs an exclusive-OR function on the result and the old parity bits from disk medium 46. For a single interface striping configuration, the output from buffer 28 of data disk drive 24 would be supplied directly to XOR device 44 of parity drive 40. The old parity bits correspond to the old data of data disk drive 24. The resulting new parity bits are written to disk medium 46.
To support the distributed XOR function, distributed XOR device 20 of data disk drive 24 in FIG. 1 contains XOR buffering. A traditional arrangement would utilize two separate buffers. To illustrate, FIG. 2 shows a distributed XOR device 20 having data buffers 50 and 60, along with temporary buffer 70 and XOR logic 80.
Still referring to FIG. 2, old data is written to data buffer 50 in response to a write command from the host (not shown) and stored in temporary buffer 70. New data from the host is stored in data buffer 60. XOR logic 80 receives the data from both buffers 60 and 70, and performs an exclusive-OR function on the two sets of data. The result of the exclusive-OR function is then stored in temporary buffer 70. Since temporary buffer 70 is usually not large enough to store the entire contents of the result, the result is then stored in buffer 28. Buffer 28 can then supply data to either array controller 10 or parity disk drive 40 (FIG. 1), depending on whether a multiple or single interface striping arrangement is used.
The disadvantages of the traditional arrangement described above are its costs. If data buffers 50 and 60 are large, then temporary buffer 70 must be correspondingly large, which is expensive. If temporary buffer 70 is not large enough to store the result of the exclusive-OR function, additional buffer bandwidth must be used to write the result to buffer 28. If temporary buffer 70 is too small, then the old and new data cannot be read in long bursts from data buffers 50, 60. This exacts a time cost since data will take longer to be read from data buffers 50, 60. Buffer 28 also imparts a time delay.
In addition, a read from and a write to temporary buffer 70 occur for every data buffer 60 read. Consequently, temporary buffer 70 must maintain twice the bandwidth required for reading data buffer 60. The bandwidth of temporary buffer 70 will either put a limit on the bandwidth of distributed XOR device 20, or will be expensive to implement in hardware.
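The two-to-one access ratio can be seen in a simple software model of the traditional arrangement. This is only an illustrative sketch: the `CountingBuffer` class and `xor_through_temp` function are hypothetical, each buffer access is counted per word, and the temporary buffer is assumed to already hold the old data.

```python
class CountingBuffer:
    """A word-addressed buffer that counts its read and write accesses."""

    def __init__(self, words):
        self.words = list(words)
        self.reads = 0
        self.writes = 0

    def read(self, index):
        self.reads += 1
        return self.words[index]

    def write(self, index, value):
        self.writes += 1
        self.words[index] = value


def xor_through_temp(new_data, temp):
    """XOR each new word with the old word held in temp, storing the result in temp."""
    for i in range(len(temp.words)):
        new_word = new_data.read(i)         # one read of the new data buffer
        old_word = temp.read(i)             # one read of the temporary buffer
        temp.write(i, new_word ^ old_word)  # one write to the temporary buffer


new_data_buffer = CountingBuffer([0x01, 0x02, 0x03, 0x04])  # plays the role of buffer 60
temp_buffer = CountingBuffer([0x05, 0x06, 0x07, 0x08])      # plays the role of buffer 70

xor_through_temp(new_data_buffer, temp_buffer)
temp_accesses = temp_buffer.reads + temp_buffer.writes
```

In this model `temp_accesses` is twice `new_data_buffer.reads`, which mirrors the doubled bandwidth requirement on temporary buffer 70.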