The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description which may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
Performance in microprocessor and semiconductor memory technology continues to increase at a rapid pace. Drive storage technology has typically not kept pace. Redundant arrays of inexpensive disks (RAID) have been used to improve the data transfer rate and data input/output (I/O) rate over other types of disk access. RAID systems also provide greater data reliability at a low cost.
A RAID system distributes storage over multiple drives. When one of the drives fails, a RAID controller performs data recovery. It may also be desirable to add or remove a drive from the RAID system. The RAID system typically uses one or more parity drives to store error-correcting parity information and a plurality of data drives that store user information. If a data drive fails, the contents of the failed drive can be reconstructed using the information from the remaining data drives and the parity drive(s).
Each drive in a RAID system may generate its own Error Correction Code (ECC) and cyclic redundancy code (CRC). In addition, another layer of ECC may be added across the drives in the RAID system to handle drive failures.
Referring now to FIGS. 1 and 2, the structure of a RAID system is shown in further detail. In FIG. 1, a storage system 10 includes data drives 12 d1-dk and parity drives 14 p1-pr. The number of data drives k may be different than the number of parity drives r. The data drives 12 and parity drives 14 communicate via a bus 16. The bus 16 may also communicate with a RAID control module 18.
The RAID control module 18 may include a RAID ECC encoder 20 and a RAID ECC decoder 22. The RAID control module 18 communicates with a system 24 such as a computer or a network of computers. The data storage system 10 stores and retrieves user information on the data drives 12. The RAID control module 18 generates ECC redundancy that is stored on the parity drives 14. The RAID control module 18 may use a cyclic code such as a Reed-Solomon (RS) ECC.
Let si(j) be the user data corresponding to the Logical Block Address (LBA) j on the i-th data drive in the RAID system. The data bits in si(j) may be grouped into symbols if a non-binary ECC is used. A RAID ECC code word is formed by associating corresponding symbols across all of the data drives, i.e., w(j,l)=(s0(j,l), s1(j,l), . . . , sk-1(j,l)), where l=0, 1, . . . , L−1 enumerates RAID ECC symbols within si(j).
To recover one sector on a failed drive, the RAID control module 18 carries out L ECC decoding operations (one for each symbol). For example, if the individual drives forming a RAID system have a 0.5 Kbyte (512-byte) sector format, and the RAID ECC operates on a byte level (e.g. the RAID ECC is an RS ECC over GF(2^8)), then there are 512 RAID RS ECC codewords for each sector of a fixed component drive.
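The grouping of symbols into code words described above can be sketched as follows. This is a minimal illustration only, assuming byte-sized symbols and a hypothetical group of k=3 data drives; the function name `form_data_words` and the sample sector contents are not part of the disclosure, and the parity symbols that an actual RAID ECC encoder would append are omitted.

```python
from typing import List, Tuple

SECTOR_BYTES = 512  # 0.5 Kbyte sector; one byte per RAID ECC symbol (assumption)


def form_data_words(sectors: List[bytes]) -> List[Tuple[int, ...]]:
    """Return L data words for one LBA.

    Word l holds symbol l taken from every data drive, i.e.
    w(j,l) = (s0(j,l), s1(j,l), ..., sk-1(j,l)).
    """
    length = len(sectors[0])
    if any(len(s) != length for s in sectors):
        raise ValueError("all sectors must have the same length")
    # zip(*sectors) walks the sectors in lockstep, one symbol position at a time.
    return list(zip(*sectors))


# Hypothetical example: k = 3 data drives, each sector filled with its drive index.
sectors = [bytes([i]) * SECTOR_BYTES for i in range(3)]
words = form_data_words(sectors)
print(len(words))   # one word per symbol position in the sector
print(words[0])     # symbols at position l = 0 across the three drives
```

With byte-level symbols and a 512-byte sector, the number of words equals the number of decoding operations L needed to recover one sector.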
One simple example is a RAID system including two drives, where the RAID ECC utilizes a (2,1) repetition code. Consequently, both drives contain the same information. If the first drive (data drive) fails, then the second drive (parity drive) can be used to restore lost information.
Another exemplary RAID system can employ a single parity bit code-based ECC. For example, a RAID system may include k user drives and 1 parity drive (e.g. k=10). Let si(j) denote the sector from the i-th drive corresponding to LBA=j. The RAID ECC encoder ensures that s0(j)+s1(j)+ . . . +s10(j)=0 for all possible LBA values j (here “+” refers to the bitwise XOR operation, and 0 represents a sector-long zero vector). If only one out of the 11 drives fails, for example drive 0, then the lost data can be reconstructed from the other drives via s0(j)=s1(j)+ . . . +s10(j) for all valid LBA values j.
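The single parity relationship above can be sketched in a few lines. This is an illustrative sketch only: the helper name `xor_sectors`, the 16-byte sector length, and the sample data are assumptions made for brevity, and the parity sector is placed last in the stripe purely for convenience.

```python
from functools import reduce
from typing import List


def xor_sectors(sectors: List[bytes]) -> bytes:
    """Bitwise XOR of equally sized sectors (the "+" of the text)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))


k = 10  # number of user drives, as in the example above
# Hypothetical sector contents for one LBA j on each user drive.
data = [bytes((i * 7 + n) % 256 for n in range(16)) for i in range(k)]

# The encoder chooses the parity sector so that the XOR over all 11 drives is zero.
parity = xor_sectors(data)
stripe = data + [parity]
assert xor_sectors(stripe) == bytes(16)

# If drive 0 fails, its sector is the XOR of the ten surviving sectors.
recovered = xor_sectors(stripe[1:])
assert recovered == data[0]
```

Because XOR is its own inverse, the same routine serves for both encoding and recovery of any single failed drive.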
Referring now to FIG. 3, exemplary logical and physical locations of a RAID system 50 including twelve drives are illustrated. The RAID system may include four parity drives. Let si(j,l) represent the l-th symbol of a sector with LBA=j written on the i-th drive. Then s0(j,l), s1(j,l), . . . , s11(j,l) form the RS ECC codeword for all values of j and l.
The physical location of the drives within a RAID system 56 is illustrated with numerals 0-11. Arrows 58 illustrate the mapping between physical drive locations 0-11 and logical drive locations 52-0 through 52-11, where each logical drive location corresponds to a symbol index within the RS ECC codeword.
For example, the drive with physical location 10 stores a 1st symbol of each RAID RS ECC codeword. More locations may be added or removed when the requirements of the RAID system change. It is desirable to allow the RAID system to expand (add new data or parity drives) or contract (remove data or parity drives) without having to take the system offline for prolonged periods of time to perform maintenance.
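A physical-to-logical map of this kind can be sketched as a simple lookup table. The permutation below is hypothetical and does not reproduce FIG. 3; it only assumes, for illustration, that physical location 10 is mapped to logical index 0. The function name `codeword_in_logical_order` is likewise an assumption.

```python
from typing import Dict, List

# Hypothetical physical-to-logical map for twelve drives (not taken from FIG. 3).
physical_to_logical: Dict[int, int] = {
    0: 3, 1: 7, 2: 1, 3: 10, 4: 5, 5: 11,
    6: 2, 7: 8, 8: 4, 9: 9, 10: 0, 11: 6,
}


def codeword_in_logical_order(symbols_by_physical: List[int],
                              p2l: Dict[int, int]) -> List[int]:
    """Arrange one symbol per drive into codeword (logical) order."""
    word = [0] * len(p2l)
    for physical, logical in p2l.items():
        word[logical] = symbols_by_physical[physical]
    return word


# Symbol read from physical drive p is tagged 100 + p so its origin stays visible.
symbols = [100 + p for p in range(12)]
word = codeword_in_logical_order(symbols, physical_to_logical)
print(word[0])  # symbol stored by the drive at physical location 10
```

Keeping the map separate from the drives themselves is what allows logical positions to be reassigned without physically moving a drive.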
Referring now to FIG. 4, when one of the drives is removed, the code length is changed. In this approach, the physical-to-logical map is changed. In step 100, the physical-to-logical map for a particular group of drives is determined. The order of logical locations does not necessarily correspond to the order of physical locations of each of the drives. In step 102, the code words are generated and saved to the drives in stripes. Code words for both the data and parity drives are generated. In step 104, the code words are stored on the data and parity drives.
In step 110, the system determines whether a data drive needs to be removed. This may or may not be due to drive errors. In step 112, logical locations of the data that are greater than the logical location of the removed drive are mapped toward lower-degree logical positions to remove a gap. By shifting the logical locations, a second (physical-to-logical) map is created. The logical locations in the second map have consecutive low-degree positions occupied. In step 114, the data drives are read and the code words for the data drives are generated. In step 116, the parity part of the second code word is written to the parity drive(s).
When drives are later added to the RAID system, they are assigned the highest-degree logical position. All of the drives are read and the parity drives are written.
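The remapping of step 112 can be sketched as follows. This is a minimal sketch under stated assumptions: the map is held as a dictionary from physical location to logical position, the four-drive map is hypothetical, and the function name `remove_drive` is illustrative. The subsequent read of the data drives and rewrite of parity (steps 114 and 116) are not shown.

```python
from typing import Dict


def remove_drive(p2l: Dict[int, int], removed_physical: int) -> Dict[int, int]:
    """Build the second map after one drive is removed.

    Logical positions above the removed drive's position shift down by one,
    so the remaining drives occupy consecutive low-degree positions.
    """
    removed_logical = p2l[removed_physical]
    return {
        physical: (logical - 1 if logical > removed_logical else logical)
        for physical, logical in p2l.items()
        if physical != removed_physical
    }


# Hypothetical first map: physical location -> logical position.
first_map = {0: 2, 1: 0, 2: 3, 3: 1}
second_map = remove_drive(first_map, removed_physical=0)
print(second_map)  # logical positions are consecutive again, with no gap
```

Because the shift changes the logical position of every drive above the gap, the code words change and the parity must be regenerated from a full read of the data drives.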
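Adding a drive under this approach can be sketched in the same way. The dictionary representation and the function name `add_drive` are assumptions for illustration; the full read of the drives and the rewrite of the parity drives that follow are not shown.

```python
from typing import Dict


def add_drive(p2l: Dict[int, int], new_physical: int) -> Dict[int, int]:
    """Return a new map with the added drive at the highest logical position."""
    if new_physical in p2l:
        raise ValueError("physical location is already occupied")
    new_map = dict(p2l)
    # The new drive takes the position one above the current highest position.
    new_map[new_physical] = max(p2l.values(), default=-1) + 1
    return new_map


# Hypothetical map: physical location -> logical position.
current_map = {1: 0, 2: 2, 3: 1}
expanded_map = add_drive(current_map, new_physical=7)
print(expanded_map)
```

Existing drives keep their logical positions in this sketch; only the code length grows by one.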
The approach described above requires a read operation on all of the data drives and a write of the parity drives whenever drives are added, removed or modified. This can reduce the uptime of the RAID system.