As the performance of microprocessor and semiconductor memory technology improves, there is a need for improved data storage systems with comparable performance enhancements. Additionally, in enhancing the performance of data storage systems, there is a need for improved reliability of the data stored. In 1988, Patterson, Gibson, and Katz published A Case for Redundant Arrays of Inexpensive Disks (RAID), ACM SIGMOD International Conference on Management of Data, pp. 109–116, June 1988. This paper laid the foundation for the use of redundant arrays of independent disks that would not only improve the data transfer rate and data I/O rate over a comparable single disk access, but would also provide error correction at a lower cost in data storage systems.
RAID may include an array of disks which may be coupled to a network server. The server, e.g., file server, database server, web server, may be configured to receive a stream of requests (Input/Output (I/O) requests) from clients in a network system to read from or write to particular disks in the RAID. The I/O requests may also be issued from an application within the server. The server may comprise a RAID controller which may be a hardware and/or software tool for providing an interface between the server and the array of disks. The server may forward the I/O requests to the RAID controller which may retrieve or store the requested data. Typically, the RAID controller manages the array of disks for storage and retrieval and views the disks of the RAID separately. The disks included in the array may be any type of data storage system which may be controlled by the RAID controller when grouped in the array.
The RAID controller may typically be configured to access the array of disks as defined by a particular “RAID level.” The RAID level may specify how the data is distributed across the disk drives and how error correction is accomplished. In the paper noted above, the authors describe five RAID levels (RAID Level 1-RAID Level 5). Since the publication of the paper, additional RAID levels have been designated.
RAID levels are typically distinguished by the benefits they provide. Three key benefits which may be included in a RAID level are fault tolerance, data availability and high performance. Fault tolerance may typically be achieved through an error correction method which ensures that information can be reconstructed in the event of a disk failure. Data availability may allow the disk array to continue to operate with a failed component. Typically, data availability may be achieved through a method of redundancy. Finally, high performance may typically be achieved by simultaneous access to multiple disk drives, which results in faster servicing of I/O and data transfer requests.
Error correction may be accomplished, in many RAID levels, by utilizing additional parity data stored with the original data. Parity data may be utilized to recover lost data due to disk failure. Parity data may typically be stored on one or more disks dedicated for error correction only or distributed over all of the disks within an array.
By the method of redundancy, data may be stored in multiple disks of the array. Redundancy is a benefit in that redundant data allows the storage system to continue to operate with a failed component while data is being replaced through the error correction method. Additionally, redundant data is more beneficial than back-up data because back-up data is typically outdated when needed whereas redundant data is current when needed.
In many RAID levels, redundancy may be incorporated through data interleaving which distributes the data over all of the data disks in the array. Data interleaving is usually in the form of data “striping” in which data to be stored is broken down into blocks called “stripe units” which are then distributed across the array of disks. Stripe units are typically predefined as a bit, byte, block or other unit. Stripe units are further broken into a plurality of sectors, where all sectors are of an equivalent predefined size. A “stripe” is a group of corresponding stripe units, one stripe unit from each disk in the array. Thus, “stripe size” is equal to the size of a stripe unit times the number of data disks in the array.
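The striping scheme above can be sketched as a simple address mapping. The following is a minimal illustration, assuming a hypothetical array of 4 data disks and a 4 KiB stripe unit (both values, and the `locate` helper, are illustrative, not taken from any particular controller):

```python
STRIPE_UNIT = 4096   # bytes per stripe unit (assumed for illustration)
NUM_DISKS = 4        # number of data disks in the array (assumed)

def locate(logical_byte: int) -> tuple[int, int, int]:
    """Map a logical byte offset to (disk index, stripe index, offset in unit)."""
    unit = logical_byte // STRIPE_UNIT   # which stripe unit overall
    disk = unit % NUM_DISKS              # stripe units rotate round-robin across disks
    stripe = unit // NUM_DISKS           # which stripe on that disk
    return disk, stripe, logical_byte % STRIPE_UNIT

# Stripe size = size of a stripe unit times the number of data disks.
STRIPE_SIZE = STRIPE_UNIT * NUM_DISKS   # 16 KiB per stripe in this sketch
```

For example, under these assumptions byte 4096 lands at the start of the second disk's first stripe unit, while byte 16384 wraps back to the first disk in the next stripe.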
In an example, RAID level 5 utilizes data interleaving by striping data across all disks and provides for error correction by distributing parity data across all disks. For each stripe, the data stripe units are logically combined with one another to calculate the parity for the stripe. Logical combination may be accomplished by an exclusive or (XOR) of the stripe units. For N physical drives, N−1 of the physical drives will receive a data stripe unit for the stripe and the Nth physical drive will receive the parity for the stripe. For each stripe, the physical drive receiving the parity data rotates such that all parity data is not contained on a single disk.
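The XOR parity and rotation described above can be sketched as follows, assuming N = 4 physical drives and a hypothetical rotation rule (real controllers use various layouts; the function names here are illustrative). The key property is that any one lost stripe unit equals the XOR of the surviving units:

```python
N = 4  # physical drives (assumed for illustration)

def parity_drive(stripe: int) -> int:
    """Drive holding the parity unit for this stripe; rotates per stripe
    so that all parity data is not contained on a single disk."""
    return (N - 1 - stripe) % N

def xor_parity(units: list[bytes]) -> bytes:
    """XOR stripe units of equal size together; used both to compute
    parity and to reconstruct a lost unit."""
    out = bytearray(len(units[0]))
    for u in units:
        for i, b in enumerate(u):
            out[i] ^= b
    return bytes(out)

# Three data stripe units (N - 1 of them) and their parity unit.
data = [b"\x0f\x0f", b"\xf0\xf0", b"\xaa\xaa"]
parity = xor_parity(data)

# If one data unit is lost, XOR of the survivors and the parity recovers it.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]
```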
The array of disks in a RAID may include a disk commonly referred to as a “hot spare” disk that is dedicated for storing data from a failed disk. The hot spare disk may be unused during normal operations, i.e., when there does not exist a failed disk. When a disk in the RAID fails, the data that used to be on the failed disk may be rebuilt to the hot spare disk. Once the failed disk is either repaired or replaced by a spare disk, the data on the repaired or spare disk may be rebuilt using the data on the hot spare disk.
In another embodiment, the hot spare disk may be distributed among the array of disks in the RAID to lessen the number of reads and writes to complete the rebuild. For example, each stripe may comprise a stripe unit designated to store data corresponding to a stripe unit from a failed disk. These stripe units may commonly be referred to as “spare units.” Hence, once a disk in the disk array in the RAID fails, the data that used to be on the failed disk may be rebuilt to spare units distributed among the remaining disks in the disk array. The process of reconstructing the data contained on the failed disk and copying it onto the spare unit of each stripe may be referred to as “compaction.” Once the data contained on the failed disk has been rebuilt onto the spare unit of each stripe, the RAID may be said to be in a “compacted state,” since the number of active disks in the disk array is decreased by one.
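The compaction process can be sketched as below. This is a minimal illustration assuming a hypothetical 4-disk array in which stripe s keeps its spare unit on disk s mod 4; the `compact` and `reconstruct` names are illustrative, and for simplicity the sketch skips stripes whose spare unit sat on the failed disk itself (a real layout would account for this case):

```python
NUM_DISKS = 4  # disks in the array (assumed)

def spare_disk(stripe: int) -> int:
    """Disk holding the spare unit of this stripe (assumed rotation scheme)."""
    return stripe % NUM_DISKS

def compact(stripes: list[list[bytes]], failed: int, reconstruct) -> None:
    """Rebuild each of the failed disk's stripe units into that stripe's
    spare unit. Afterwards the array is in the 'compacted state': one
    fewer active disk, with the failed disk's data spread over the rest."""
    for s, units in enumerate(stripes):
        sp = spare_disk(s)
        if sp != failed:
            # reconstruct(s) would typically be the XOR of the stripe's
            # surviving units, as in RAID level 5 error correction.
            units[sp] = reconstruct(s)
```

Because the spare units are distributed across all remaining disks, the rebuild writes are spread over the whole array rather than funneled onto a single hot spare.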
Once the failed disk is either repaired or replaced by a spare disk, the data on the failed disk may be restored by copying back the spare unit data onto the corresponding stripe units in the repaired or spare disk stripe by stripe. The process of copying back the spare unit data onto the repaired or spare disk stripe by stripe may be referred to as “expansion.” During the expansion process, each stripe that is updated, i.e., each stripe unit per stripe that is rebuilt in the repaired or spare disk, may be tracked. Typically, each stripe that is updated may be tracked by a table stored in a non-volatile memory in the RAID controller. Each stripe updated during the expansion process may be tracked in case the repaired or spare disk fails during the expansion process. If the repaired or spare disk fails during the expansion process, RAID is able to enter a compacted state since the last stripe updated is known. That is, the data that used to be on the failed disk up to the last stripe updated may be rebuilt to spare units distributed among the remaining disks in the disk array thereby allowing the disk array to enter a compacted state. However, persistently tracking each stripe updated during expansion is time-consuming. Furthermore, non-volatile memory is expensive.
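The expansion process, with its per-stripe tracking, can be sketched as follows. The `expand` helper and the dict standing in for the non-volatile tracking table are illustrative assumptions, not an actual controller interface; the point is that recording the last stripe written is what allows the array to re-enter a compacted state if the new disk fails mid-expansion:

```python
def expand(stripes: list[list[bytes]], spare_disk_of, new_disk: int,
           tracking_table: dict) -> None:
    """Copy each stripe's spare-unit data back onto the repaired or
    replacement disk, stripe by stripe, recording progress as we go."""
    for s, units in enumerate(stripes):
        sp = spare_disk_of(s)
        units[new_disk] = units[sp]        # copy spare unit back to the new disk
        # In the scheme described, this record would be persisted to
        # non-volatile memory per stripe, which is the costly step.
        tracking_table["last_stripe"] = s
```

If the new disk fails at stripe k, the table shows that stripes 0 through k were updated, so exactly that data can be rebuilt back onto the spare units to re-enter the compacted state; the expense lies in persisting the record on every stripe.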
It would therefore be desirable to eliminate the necessity of tracking the stripes updated during expansion in order to enter a compacted state upon a failure of the repaired or spare disk during expansion.