A data storage array is a collection of storage elements that are accessible by a host computer as a single storage unit. The individual storage elements can be any type, or a combination of types, of storage device, such as hard disk drives, semiconductor memory, optical disks, magnetic tape drives, and the like. An individual storage element may also be a track in a single magnetic tape.
A data storage array includes a collection of storage elements and a controller. The controller controls the operation of the storage elements and presents them as a single storage unit to a host computer. The host computer typically executes an operating system and application programs. A virtual storage element is an abstract entity realized by the controller and the data storage elements. A virtual storage element is functionally identical to a single physical storage element from the standpoint of the host computer.
One such data storage array is a redundant array of independent disks (RAID) or tapes (RAIT). RAID comes in various operating levels, which range from RAID level 0 (RAID-0) to RAID level 6 (RAID-6). Additionally, there are multiple combinations of the various RAID levels that form hybrid RAID levels such as RAID-5+, RAID-6+, RAID-10, RAID-53, and so on. Each RAID level represents a different form of data management and data storage within the RAID disk array.
In a RAID-4 array, data is generally mapped to the various physical storage elements in data “stripes” across the storage elements and vertically in a “block” within a single storage element. To facilitate data storage, a serial data stream is partitioned into data blocks. Each data block is stored on a different storage element as the data blocks are striped across the storage elements. Once all the storage elements in a data stripe have been given data blocks, the storage process returns to the first storage element in the data stripe, and stripes data blocks across all the storage elements again.
In a RAID-4 array, data consistency and redundancy are assured using parity data that is striped to one of the storage elements. Specifically, a RAID-4 array contains N−1 data storage elements and a parity storage element. Each data stripe contains N−1 data blocks and one parity block. The N−1 data blocks are striped to the respective N−1 data storage elements, and the parity block is striped to the parity storage element. The process then continues for the next data stripe.
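The block-to-element mapping described above can be sketched as follows. This is a minimal illustration only: the function names `locate_block` and `parity_location` are hypothetical, and placing the parity storage element at the last index is an assumption made for the example, not something the description fixes.

```python
def locate_block(block_index, n):
    """Map the i-th data block of a serial stream to (stripe, element).

    Assumes n storage elements in total: elements 0..n-2 hold data and
    element n-1 holds parity (the choice of the last index is an assumption).
    """
    if n < 2:
        raise ValueError("need at least one data element and one parity element")
    data_elements = n - 1
    # divmod gives the stripe number and the position within that stripe.
    stripe, element = divmod(block_index, data_elements)
    return stripe, element


def parity_location(stripe, n):
    """Return (stripe, element) of the parity block for the given stripe."""
    return stripe, n - 1


# Six data blocks laid out on an array with N = 4 (three data elements).
layout = [locate_block(i, 4) for i in range(6)]
```

With N = 4, the fourth data block wraps around to the first data storage element and begins the second stripe.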
RAID-4 parity is generated using an exclusive OR (XOR) function. In general, parity data is generated by taking an XOR function of the data blocks within a given data stripe. Using the parity information, the contents of any block on any single one of the storage elements in the array can be regenerated from the contents of the corresponding blocks on the remaining storage elements in the array. Consequently, if the XOR of all corresponding blocks in the data stripe except one is computed, the result is the remaining block. Thus, if data storage element one in the storage element array should fail, for example, the data block it contains can still be delivered to applications by reading the corresponding blocks (data and parity) from all the surviving storage elements and computing the XOR of their contents. As such, the RAID-4 array is said to be fault tolerant, i.e., the loss of one storage element in the array does not impact data availability.
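The XOR property can be checked with a small example. The sketch below assumes equal-length blocks represented as Python `bytes`; the helper name `xor_blocks` and the sample values are illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe with three data blocks (N - 1 = 3) and its parity block.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Suppose the storage element holding data[1] fails. XOR of the surviving
# data blocks and the parity block yields the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, the same helper serves both to generate parity and to regenerate a lost block.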
In a data storage array, the controller accumulates parity as the data blocks are sequentially written from the host computer to the storage elements. When all of the data blocks in a data stripe have been written to respective data storage elements, the controller writes the parity block to the parity storage element. The controller then reinitializes and starts accumulating parity again for the next data stripe to be written to the storage elements. This continues until all of the data blocks have been written to the storage elements.
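The write-side accumulation can be sketched as a loop that keeps a running parity buffer and resets it at each stripe boundary. This is a simplified model, not the controller's actual implementation: `write_stream` is a hypothetical name, stripes are returned as in-memory tuples rather than written to devices, and the sketch assumes the number of data blocks is a whole multiple of the stripe width, since handling of a trailing partial stripe is not described.

```python
def write_stream(blocks, num_data, block_size):
    """Group sequential data blocks into stripes, accumulating parity.

    Returns a list of (data_blocks, parity_block) tuples, one per stripe.
    Assumes len(blocks) is a multiple of num_data (an assumption of this
    sketch; partial stripes are not handled).
    """
    if len(blocks) % num_data != 0:
        raise ValueError("partial final stripe is not handled in this sketch")
    stripes = []
    parity = bytearray(block_size)  # running parity, starts at all zeros
    current = []
    for block in blocks:
        if len(block) != block_size:
            raise ValueError("all blocks must have the same size")
        current.append(block)
        for i, byte in enumerate(block):
            parity[i] ^= byte  # accumulate parity as each block is written
        if len(current) == num_data:
            # Stripe complete: emit the parity block, then reinitialize.
            stripes.append((current, bytes(parity)))
            parity = bytearray(block_size)
            current = []
    return stripes


stripes = write_stream([b"\x01", b"\x02", b"\x04", b"\x08"], num_data=2, block_size=1)
```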
During a normal read operation, the controller sequentially passes the data blocks from the storage elements to the host computer and discards the parity blocks. If a read error occurs such that a data block is unreadable, the controller returns to the storage element having the first data block in the data stripe and then sequentially rereads the data blocks prior to the bad data block. The controller accumulates parity for the reread data blocks. The controller then skips the bad data block and continues sequentially reading the remaining data blocks and the parity block in the data stripe. The controller then reconstructs the bad data block from three inputs: the parity of the data blocks prior to the bad data block, the data blocks after the bad data block, and the parity block.
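The recovery sequence described above can be modeled to make its cost visible. The sketch below is a simulation under stated assumptions: a stripe is a list of in-memory blocks, `bad_index` marks the block treated as unreadable (its stored value is never accessed), and the names `xor_into` and `read_stripe_with_reread` are hypothetical. The read counter includes the failed read attempt.

```python
def xor_into(acc, block):
    """XOR a block into a bytearray accumulator in place."""
    for i, byte in enumerate(block):
        acc[i] ^= byte


def read_stripe_with_reread(stripe, parity_block, bad_index):
    """Read one stripe whose block at bad_index is unreadable.

    Returns (data_blocks, element_reads). element_reads counts every block
    access, including the failed attempt and the rereads.
    """
    reads = 0
    out = []
    # Normal sequential read up to the bad block; no parity is kept.
    for i in range(bad_index):
        out.append(stripe[i])
        reads += 1
    reads += 1  # the read attempt that fails on the bad block
    # Recovery: return to the first block and reread, accumulating parity.
    acc = bytearray(len(parity_block))
    for i in range(bad_index):
        xor_into(acc, stripe[i])
        reads += 1
    # Skip the bad block, then read the remaining data blocks and parity.
    rest = []
    for i in range(bad_index + 1, len(stripe)):
        xor_into(acc, stripe[i])
        rest.append(stripe[i])
        reads += 1
    xor_into(acc, parity_block)
    reads += 1
    # acc now holds the reconstructed contents of the bad block.
    return out + [bytes(acc)] + rest, reads


data = [b"AB", b"CD", b"EF", b"GH"]
parity_acc = bytearray(2)
for blk in data:
    xor_into(parity_acc, blk)
recovered, reads = read_stripe_with_reread(data, bytes(parity_acc), bad_index=2)
```

In this example the two blocks ahead of the bad block are each read twice, which is the overhead that grows with the position of the bad block in the stripe.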
A problem with rereading the data blocks prior to the bad data block for reconstructing the bad data block is that this operation is time consuming. What is needed is a data storage array method and system for accumulating the parity of data blocks during sequential reading. Thus, it would not be necessary to reread the data blocks prior to a bad data block as the controller would already have the information (parity of the data blocks already read) needed to complete the reconstruction of the bad data block.
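One way to picture the stated need is a read loop that keeps a running parity of every block it has already passed to the host. The following is an illustrative sketch only, not a description of any particular controller: the names `xor_into` and `read_stripe_accumulating` are hypothetical, the stripe is modeled as in-memory blocks, and `bad_index` marks the block treated as unreadable (its stored value is never accessed).

```python
def xor_into(acc, block):
    """XOR a block into a bytearray accumulator in place."""
    for i, byte in enumerate(block):
        acc[i] ^= byte


def read_stripe_accumulating(stripe, parity_block, bad_index=None):
    """Read one stripe while accumulating parity of the blocks read so far.

    Returns (data_blocks, element_reads). If bad_index is None the stripe is
    read without error and the parity block is not needed.
    """
    acc = bytearray(len(parity_block))
    out = []
    reads = 0
    for i, block in enumerate(stripe):
        reads += 1  # one access per block, including a failed attempt
        if i == bad_index:
            out.append(None)  # placeholder; filled in after the stripe is read
            continue
        xor_into(acc, block)  # running parity of every good block
        out.append(block)
    if bad_index is not None:
        # The accumulator already covers the earlier blocks, so only the
        # parity block must be folded in; nothing is reread.
        xor_into(acc, parity_block)
        reads += 1
        out[bad_index] = bytes(acc)
    return out, reads


data = [b"AB", b"CD", b"EF", b"GH"]
parity_acc = bytearray(2)
for blk in data:
    xor_into(parity_acc, blk)
recovered, reads = read_stripe_accumulating(data, bytes(parity_acc), bad_index=2)
```

Under these assumptions, each block in the stripe is accessed once regardless of where the bad block falls, at the price of one XOR per block during every read.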