The present invention relates to a computer data storage scheme known as a Redundant Array of Independent Discs (RAID) and, in particular, to a method and apparatus for generating parity for a RAID array.
RAID denotes a data storage scheme that can divide and replicate data across multiple hard disc drives. Various levels of RAID are denoted by a number following the word RAID, e.g., RAID-0, RAID-1, etc. The various levels of RAID are characterized by two key aspects, namely data reliability (or fault tolerance) and increased input/output performance. Increased performance is obtained by distributing data over several hard disc drives, a technique known as striping, which spreads the load over more hardware. Data reliability or fault tolerance is obtained by redundancy, where additional information is stored so that data can be recovered in the event of failure of a hard disc drive.
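By way of illustration only, striping as described above may be sketched in Python as follows. The function names `stripe` and `unstripe`, the round-robin placement, and the chunk size are assumptions chosen for the example and are not taken from any particular RAID implementation.

```python
def stripe(data, num_drives, chunk_size):
    """Split data into fixed-size chunks and deal them out round-robin.

    Returns one list of chunks per drive. Round-robin placement is an
    assumption made for this sketch.
    """
    if num_drives < 1 or chunk_size < 1:
        raise ValueError("num_drives and chunk_size must be positive")
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def unstripe(drives):
    """Reassemble the original data by reading the drives row by row."""
    rows = max((len(d) for d in drives), default=0)
    out = []
    for row in range(rows):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


layout = stripe(b"ABCDEFGH", num_drives=3, chunk_size=2)
restored = unstripe(layout)
```

In this sketch, consecutive chunks land on different drives, so a large sequential read or write may be serviced by several drives at once. Striping alone adds no redundancy.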
Redundancy is achieved by use of parity blocks. If a drive in a RAID array fails, data blocks and a parity block from surviving drives may be combined to reconstruct the data in the failed drive.
A parity block may be generated by using the Boolean XOR function. For example, parity data for two drives, Drive 1: 01101101 and Drive 2: 11010100, is calculated as: Parity data = 01101101 XOR 11010100 = 10111001. The resulting parity data is stored in a third, or parity, drive. Should any of the three drives fail, the contents of the failed drive may be reconstructed by taking the data from the two surviving drives and subjecting them to the same XOR calculation. Supposing Drive 2 fails, it may be reconstructed from Drive 3 (10111001) and Drive 1 (01101101) as follows: Drive 2 = 10111001 XOR 01101101 = 11010100.
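The XOR calculation above may be reproduced with a short Python sketch. The helper name `xor_parity` is hypothetical, and each drive is modelled as a single byte holding the bit pattern from the example.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the bytewise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


drive1 = bytes([0b01101101])
drive2 = bytes([0b11010100])

# Parity stored on the third (parity) drive.
parity = xor_parity([drive1, drive2])

# Drive 2 fails: rebuild it from the parity drive and Drive 1.
rebuilt_drive2 = xor_parity([parity, drive1])

print(format(parity[0], "08b"))          # 10111001
print(format(rebuilt_drive2[0], "08b"))  # 11010100
```

The same function serves for both parity generation and reconstruction because XOR is its own inverse: XOR-ing the parity with all surviving data blocks leaves only the missing block.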
The performance and fault tolerance of an individual hard disk drive are difficult to improve. Individual physical hard disk drives are inherently slow and have a limited life-cycle. Nevertheless, the fault tolerance and performance of the system as a whole may be improved significantly through a suitable combination of physical hard disk drives.
RAID is a proven way to increase the Mean Time Between Failures (MTBF) of an array of storage discs used in servers and computers. Levels of RAID include RAID-1, RAID-2, RAID-4, RAID-5, RAID-6, RAID-10, RAID-01, etc. RAID-5 uses a striped set with distributed parity. Upon drive failure, data in the failed drive may be reconstructed from the distributed parity such that the drive failure may be masked from the end user. RAID-5 can tolerate one disc failure. RAID-6 extends RAID-5 by adding an additional parity block, using block-level striping with two parity blocks distributed across all member discs. The two parity blocks improve reliability by allowing RAID-6 to tolerate two simultaneous disc failures.
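A minimal Python sketch of a RAID-5 style layout is given below for illustration. The rotation rule used to pick the parity drive for each stripe, and the names `build_raid5` and `rebuild_drive`, are assumptions made for the example; actual controllers may place parity differently. RAID-6 is not covered, since its second parity block requires a calculation beyond simple XOR.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_raid5(chunks, num_drives):
    """Lay out equal-size data chunks with one rotating parity block per stripe."""
    per_stripe = num_drives - 1
    if per_stripe < 2 or len(chunks) % per_stripe:
        raise ValueError("need at least 3 drives and whole stripes of data")
    drives = [[] for _ in range(num_drives)]
    for s in range(len(chunks) // per_stripe):
        data = chunks[s * per_stripe:(s + 1) * per_stripe]
        # Assumed rotation: parity moves one drive to the left each stripe.
        parity_drive = (num_drives - 1 - s) % num_drives
        remaining = iter(data)
        for d in range(num_drives):
            block = xor_blocks(data) if d == parity_drive else next(remaining)
            drives[d].append(block)
    return drives


def rebuild_drive(drives, failed):
    """Recompute every block of the failed drive from the surviving drives."""
    survivors = [d for i, d in enumerate(drives) if i != failed]
    return [xor_blocks(row) for row in zip(*survivors)]


array = build_raid5([b"AA", b"BB", b"CC", b"DD", b"EE", b"FF"], num_drives=3)
recovered = rebuild_drive(array, failed=1)
```

Because every stripe XORs to zero across all drives, the same rebuild routine recovers a lost block whether it held data or parity, which is why a single drive failure can be masked regardless of which drive fails.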
Although the terms reliability and fault tolerance are often used interchangeably in describing RAID schemes, there is a distinction between them. Reliability refers to the likelihood that an individual drive or drive array will continue to function without experiencing a failure. Reliability is typically measured over some period of time. Fault tolerance, on the other hand, is the ability of an array to withstand and recover from a drive failure. Fault tolerance is provided by some form of redundancy, including mirroring, parity, or a combination of both. Fault tolerance is typically measured by the number of drives that can fail without causing the entire array to fail.
A bundle of physical hard disks may be brought together by a RAID controller. The RAID controller may distribute data over several physical hard disks and may be completely hidden from an associated server or computer. The RAID controller may store redundant information. If a physical hard disk fails, its data may be reconstructed from the hard disks that survive. The RAID controller may initiate this process automatically. If a hard disk drive fails, the RAID controller may immediately begin to reconstruct the data from the remaining intact disks onto a hot spare disk. Recreation of data from a defective hard disk may take place at the same time that read and write operations of the server take place to the bundle of hard disks.