1. Field of the Invention
This invention relates to recovery of parity data on electronic data mass storage systems known as RAID (Redundant Array of Independent Disks).
2. Related Art
RAID is a popular and well-known method used for storage and retrieval of data. It offers a data source that can be made readily available to multiple users with a high degree of data security and reliability.
In general, RAID is available in several configurations known as levels. Each of these levels offers at least one performance enhancement over a single drive (e.g., data mirroring, faster reads, data recovery). A popular feature of RAID, and probably the justification for its use in so many systems, is the ability to reconstruct lost data from parity information that is recorded along with the other data. Committing large amounts of data to a RAID therefore places considerable trust in the premise that the data will be recoverable from the parity data if a failure occurs.
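As an illustration of parity-based reconstruction, the following minimal sketch assumes a single XOR parity block per stripe, as used in RAID levels 4 and 5; the text above does not commit to a particular parity scheme, and the helper name `xor_blocks` is hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks of one stripe (illustrative values only).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# If block 1 is lost, XOR-ing the surviving blocks with the parity restores it.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same identity explains the weakness discussed next: if the parity block does not match the data blocks, the XOR no longer yields the missing block.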
Problems can arise when a failure does occur and both the parity data and the other stored data are damaged. Without the parity information, it is impossible to recompute missing data.
A first known method used to combat this weakness is to log RAID stripes as they are written. In the event a crash occurs, the log can be used to determine which blocks should have their associated redundancy information recomputed. Variants of this technique include: logging the actual data, logging time-stamps and block numbers of blocks written, and logging stripe numbers and parity information to non-volatile memory.
Logs reduce the amount of parity information that has to be reconstructed on the RAID, which in turn reduces the amount of time that the array contains unprotected data. While logs can mitigate some of this weakness in RAID implementations, maintaining them can require excessive overhead, which in turn reduces data transfer rates. Additionally, data can be lost when logs are compromised.
A second known method is to "stage" the data and parity information to a pre-write area. Following a crash, the system can copy the data/parity information from the pre-write area to the RAID array. Use of a pre-write area requires data to be written twice: once to the pre-write area and then again to the actual stripe(s) in the array. This provides a more secure write transaction at the cost of reducing data transfer speed.
Accordingly, it would be desirable to provide a technique for enabling RAID failure recovery without the severe drawbacks of the known art.
The invention provides a method and system for RAID recovery following a system crash that can function independently or as a supplemental and redundant recovery method alongside other RAID recovery strategies. A reparity bitmap is created with each bit representing N stripes within the RAID. When a write occurs to a stripe, the associated reparity bit is set to 1; otherwise the bit remains at its default value of zero.
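The mapping between stripes and reparity bits might be sketched as follows. The value of N, the packed `bytearray` layout, and the names `ReparityBitmap`, `bit_index`, and `stripes_for_bit` are all assumptions made for illustration and are not specified in the text.

```python
STRIPES_PER_BIT = 4  # hypothetical value of N

def bit_index(stripe, n=STRIPES_PER_BIT):
    """Index of the reparity bit that covers the given stripe."""
    return stripe // n

def stripes_for_bit(bit, n=STRIPES_PER_BIT):
    """Range of stripe numbers covered by the given reparity bit."""
    return range(bit * n, (bit + 1) * n)

class ReparityBitmap:
    """One bit per N stripes, packed eight bits to a byte; all bits default to 0."""

    def __init__(self, total_stripes, n=STRIPES_PER_BIT):
        self.n = n
        self.nbits = -(-total_stripes // n)          # ceiling division
        self.bits = bytearray((self.nbits + 7) // 8)

    def set(self, bit):
        self.bits[bit >> 3] |= 1 << (bit & 7)

    def clear(self, bit):
        self.bits[bit >> 3] &= ~(1 << (bit & 7)) & 0xFF

    def test(self, bit):
        return bool(self.bits[bit >> 3] & (1 << (bit & 7)))

bitmap = ReparityBitmap(total_stripes=100)
bitmap.set(bit_index(9))   # a write to stripe 9 marks the bit covering stripes 8..11
```

A larger N makes the bitmap smaller but means more stripes must be examined for each bit found set after a crash.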
Each bit in the reparity bitmap has an associated in-memory write counter. The write counter is used to track the number of writes in progress to a stripe range. Upon initiation of the first write to a stripe range, the reparity bit for the stripe range is set, and the write counter is incremented from its default value to indicate that one write is in progress. Subsequent concurrent writes cause the write counter to be incremented further.
Upon completion of a write to the stripe range, the write counter is decremented. When all writes to the stripe range have been completed, the write counter will have returned to its default value, the reparity bit is cleared, and the reparity bitmap is written to disk. Using the write counter allows multiple writes to a stripe range without incurring two extra write I/Os (for the bitmap) per stripe write, which greatly reduces overhead.
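The counter behavior described above can be sketched as follows. This is a single-threaded illustration under stated assumptions: the class name `StripeRangeTracker` is hypothetical, the default counter value is taken to be zero, and `_flush` merely counts bitmap writes instead of performing disk I/O.

```python
class StripeRangeTracker:
    """Tracks one reparity bit and one in-memory write counter per stripe range."""

    def __init__(self, nbits):
        self.bits = [0] * nbits       # reparity bits (persisted via _flush)
        self.counters = [0] * nbits   # writes in progress, kept in memory only
        self.flushes = 0              # number of times the bitmap was written out

    def _flush(self):
        # Stand-in for writing the reparity bitmap to disk (assumption).
        self.flushes += 1

    def begin_write(self, bit):
        self.counters[bit] += 1
        if self.counters[bit] == 1:   # first write in flight for this range
            self.bits[bit] = 1
            self._flush()

    def end_write(self, bit):
        self.counters[bit] -= 1
        if self.counters[bit] == 0:   # last write for this range has finished
            self.bits[bit] = 0
            self._flush()

tracker = StripeRangeTracker(nbits=8)
for _ in range(3):
    tracker.begin_write(3)            # three overlapping writes to one range
for _ in range(3):
    tracker.end_write(3)
```

In this run the three overlapping writes cause two bitmap writes in total, rather than the six that setting and clearing the bit around every stripe write would require.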
Before executing a write, the writer checks the reparity bitmap. If the bit associated with that stripe is zero, the write counter is incremented for that reparity bitmap bit and the reparity bit is set to 1. The writer can proceed with the stripe write once the reparity bitmap is written to disk.
In the event the reparity bit is already set to 1, the writer increments the write counter and checks to see if the reparity bitmap is in the process of being written to disk. If the reparity bitmap is in the process of being written to disk, the writer waits for the reparity bitmap to be written and then writes the stripe; otherwise, the writer writes the stripe without waiting.
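The writer's decision procedure might be sketched as the small state machine below. It is a simplified, single-threaded model: the names `Action`, `ReparityState`, `start_write`, and `flush_done` are hypothetical, and a single `flush_in_progress` flag stands in for whatever mechanism a real system would use to track an outstanding bitmap write.

```python
from enum import Enum

class Action(Enum):
    FLUSH_THEN_WRITE = 1  # bit was clear: set it, write the bitmap, then the stripe
    WAIT_THEN_WRITE = 2   # bit already set, but a bitmap write is still in flight
    WRITE_NOW = 3         # bit already set and the bitmap is not being written

class ReparityState:
    def __init__(self, nbits):
        self.bits = [0] * nbits
        self.counters = [0] * nbits
        self.flush_in_progress = False  # assumption: one flag for the whole bitmap

def start_write(state, bit):
    """Decide what a writer must do before writing a stripe covered by `bit`."""
    state.counters[bit] += 1
    if state.bits[bit] == 0:
        state.bits[bit] = 1
        state.flush_in_progress = True   # the bitmap must reach disk first
        return Action.FLUSH_THEN_WRITE
    if state.flush_in_progress:
        return Action.WAIT_THEN_WRITE
    return Action.WRITE_NOW

def flush_done(state):
    """Called when the bitmap write to disk has completed."""
    state.flush_in_progress = False

state = ReparityState(nbits=8)
first = start_write(state, 2)    # bit clear: this writer triggers the bitmap write
second = start_write(state, 2)   # bit set, bitmap write still in flight
flush_done(state)
third = start_write(state, 2)    # bit set and already on disk
```

The ordering matters because a stripe write that began before its reparity bit reached disk would not be identified after a crash.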
If a system crash occurs, the reparity bitmap identifies those stripes that were in the process of being written; all other stripes are assured to be consistent. On reboot, the reparity bitmap is read by the RAID system and, if needed, recomputation of the data using parity information occurs on only those stripes whose associated reparity bit is set.
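The selection of stripes to examine after a reboot can be sketched as below. The function name `stripes_needing_reparity` and its parameters are hypothetical, the bitmap is modeled as a plain list of 0/1 values, and the recomputation step itself is left out.

```python
def stripes_needing_reparity(bits, stripes_per_bit, total_stripes):
    """Yield every stripe number covered by a reparity bit that is set."""
    for bit, flag in enumerate(bits):
        if flag:
            start = bit * stripes_per_bit
            # The last range may be partial if total_stripes is not a multiple of N.
            stop = min(start + stripes_per_bit, total_stripes)
            yield from range(start, stop)

# Bitmap as read from disk after a crash (illustrative values only).
bits_on_disk = [0, 1, 0, 1]
suspect = list(stripes_needing_reparity(bits_on_disk, stripes_per_bit=4, total_stripes=14))
```

With these illustrative values, only the stripes in the two marked ranges are visited, and the remaining stripes are skipped entirely.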
This summary has been provided so that the nature of the invention may be understood quickly. A more complete understanding may be obtained by reference to the following description of the preferred embodiments in combination with the attached drawings.