The present invention relates in general to data processing systems and, in particular, to the use of storage element polymorphism to reduce performance degradation during error recovery.
Computer systems typically include a large amount of both nonvolatile data storage (e.g., Hard Disk Drives (HDDs) or Solid State Drives (SSDs)) and volatile data storage (e.g., Random Access Memory (RAM)) to hold information such as operating system software, programs and other data. As is well known in the art, this information is commonly stored in binary format (1's and 0's), where each individual binary digit is referred to as a “bit” of data. Bits of data are often grouped to form higher level constructs, such as 8-bit “bytes” and 8- or 16-byte data “words.”
The circuits in RAM and SSDs used to store voltage levels representing bits are subject to both device failure and state changes due to high-energy cosmic rays and alpha particles. HDDs are also subject to device failures and to imperfections in the magnetic media that can change the magnetic fields representing the bits, so that the fields no longer accurately represent the originally stored data. Depending on which bit is affected, an error of just a single bit can cause an entire process, an entire partition, or even an entire computer system to fail. When an error occurs, whether a single-bit error, multi-bit error, full chip/device failure or full memory module failure, all or part of the computer system may remain down until the error is corrected or repaired. Downtime attributable to individual errors and/or to all errors collectively can have a substantial impact on computer system performance and on a business dependent on its computer systems.
The probability of encountering an error during normal computer system operation has increased concomitantly with the capacity of data storage in computer systems. Techniques to detect and correct bit errors have evolved into an elaborate science over the past several decades. One of the most basic detection techniques is the use of odd or even parity, in which the bits of a data word are exclusive OR-ed (XOR-ed) together to produce a parity bit. For example, a data word with an even number of 1's will have a parity bit of 0, and a data word with an odd number of 1's will have a parity bit of 1. If a single-bit error occurs in the data word, the error can be detected by regenerating parity from the data and then checking whether the calculated parity matches the originally generated parity stored with the word.
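The parity check described above can be illustrated with a minimal Python sketch. The function names `parity_bit` and `parity_matches` and the sample data word are hypothetical and are chosen only for illustration. The sketch also shows a known limitation of a single parity bit: an error affecting two bits leaves the parity unchanged and therefore goes undetected.

```python
from functools import reduce
from operator import xor


def parity_bit(bits):
    """XOR all bits together: 0 for an even number of 1's, 1 for an odd number."""
    return reduce(xor, bits, 0)


def parity_matches(bits, stored_parity):
    """Regenerate parity from the data and compare it with the stored parity."""
    return parity_bit(bits) == stored_parity


# Hypothetical 8-bit data word containing four 1's (an even number).
word = [1, 0, 1, 1, 0, 0, 1, 0]
stored = parity_bit(word)

# A single-bit error changes the regenerated parity and is detected.
single_error = word.copy()
single_error[3] ^= 1

# A double-bit error leaves the regenerated parity unchanged and is missed.
double_error = word.copy()
double_error[0] ^= 1
double_error[1] ^= 1

print("single-bit error detected:", not parity_matches(single_error, stored))
print("double-bit error detected:", not parity_matches(double_error, stored))
```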
Richard Hamming recognized that this parity technique could be extended not only to detect errors, but also to correct them, by appending a more intricate XOR field, referred to as an error correction code (ECC) field, to each code word. The ECC field is a combination of different bits in the code word XOR-ed together so that errors (small changes to the data word) can be easily detected, pinpointed and corrected. The number of errors that can be detected and corrected in a code word is directly related to the length of the ECC field. The challenge is to ensure a minimum separation distance between valid code words. As the number of errors that can be detected and corrected increases, the ECC field also increases in length, which creates a greater distance between valid code words (i.e., a greater Hamming distance). In current computer systems, RAM is commonly protected by ECC that supports Double-bit Error Detection (DED) and Single-bit Error Correction (SEC), which allows the RAM to recover from single-bit transient errors caused by alpha particles and cosmic rays, as well as single-bit hard errors caused by failure of RAM circuitry. The data held in HDDs are often similarly protected by checkers such as ECC, Cyclic Redundancy Checks (CRCs) and Longitudinal Redundancy Checks (LRCs).
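As a hedged illustration of SEC/DED behavior, the following Python sketch uses an extended Hamming(7,4) code, i.e., a 4-bit data word protected by three Hamming parity bits plus one overall parity bit. This small code is an assumption chosen for brevity and is not the code used by any particular memory system, which typically protects much wider words. The names `encode` and `decode` are hypothetical.

```python
def encode(data4):
    """Encode 4 data bits into an 8-bit extended Hamming code word.

    code[1..7] form a Hamming(7,4) code word with parity bits at
    positions 1, 2 and 4; code[0] is an overall parity bit.
    """
    d1, d2, d3, d4 = data4
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d1, d2, d3, d4
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    for bit in code[1:]:
        code[0] ^= bit                     # overall parity over positions 1..7
    return code


def decode(code):
    """Return (status, data4); data4 is None for an uncorrectable double error."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                # XOR of the positions holding a 1
    overall = 0
    for bit in code:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flipped bits: treat as a single error at `syndrome`
        # (a syndrome of 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "double-error", None
    return status, [code[3], code[5], code[6], code[7]]


data = [1, 0, 1, 1]
word = encode(data)

one_flip = word.copy()
one_flip[5] ^= 1
print(decode(one_flip))

two_flips = word.copy()
two_flips[2] ^= 1
two_flips[6] ^= 1
print(decode(two_flips))
```

In this sketch the syndrome pinpoints the failing bit position, while the overall parity bit distinguishes a correctable single-bit error from a detectable but uncorrectable double-bit error.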
In addition to error detection and correction facilities such as ECC, modern computer systems commonly protect data through data redundancy and/or distribution. For example, Redundant Array of Independent Disks (RAID) systems have been developed to improve the performance, availability and/or reliability of disk storage systems. RAID distributes data across several independent HDDs or SSDs. Many different RAID schemes have been developed, each having different performance, availability, and utilization/efficiency characteristics.
RAID-0 is striping of data across multiple storage devices. RAID-1 is mirroring of data. RAID-3, RAID-4 and RAID-5 are very similar in that they all use a single XOR checksum to correct for a single data element error. RAID-3 is byte-level striping with a dedicated parity device. RAID-4 uses block-level striping with a dedicated parity storage device. RAID-5 is block-level striping like RAID-4, but distributes parity information substantially uniformly across all the storage devices rather than centralizing the parity information on a dedicated parity storage device. The key attribute of RAID-3, RAID-4 and RAID-5 is that each is capable of correcting a single data element fault when the location of the fault can be pinpointed through some independent means. This capability allows RAID-3, RAID-4 and RAID-5 to correct for a complete storage device failure. RAID-6 has no single universally accepted industry-wide definition. In general, RAID-6 refers to block-level or byte-level data striping with dual checksums. An important attribute of RAID-6 is that it allows for correction of up to two data element faults when the faults can be pinpointed through some independent means, and it has the ability to pinpoint and correct a single failure when the location of the failure is not known.
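The difference between the dedicated parity device of RAID-4 and the distributed parity of RAID-5 can be sketched as follows. The rotation rule used for RAID-5 below is only one possible convention and is an assumption made for illustration, as actual implementations differ in how they rotate parity. The names `parity_device` and `stripe_layout` are hypothetical.

```python
def parity_device(stripe, num_devices, level):
    """Return the index of the device holding parity for a given stripe."""
    if level == 4:
        # RAID-4: parity always resides on one dedicated device.
        return num_devices - 1
    if level == 5:
        # RAID-5: parity rotates across devices (assumed rotation rule).
        return (num_devices - 1 - stripe) % num_devices
    raise ValueError("only RAID levels 4 and 5 are modeled in this sketch")


def stripe_layout(stripe, num_devices, level):
    """Return per-device labels for one stripe: 'P' for parity, 'Dn' for data."""
    parity = parity_device(stripe, num_devices, level)
    layout = []
    block = 0
    for device in range(num_devices):
        if device == parity:
            layout.append("P")
        else:
            layout.append(f"D{block}")
            block += 1
    return layout


for level in (4, 5):
    print(f"RAID-{level}")
    for stripe in range(4):
        print("  stripe", stripe, stripe_layout(stripe, 4, level))
```

With four devices, the RAID-4 layout places every parity block on the last device, whereas the RAID-5 layout places parity on a different device for each of four consecutive stripes, so that parity traffic is spread across all devices.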
The present application appreciates that although single-bit errors can often be swiftly corrected with little if any performance impact by conventional ECC mechanisms, storage systems suffer significant performance degradation when an error or failure is encountered that requires lost data to be recreated from multiple storage devices. For example, if a storage device failure occurs in a storage system employing a simple RAID-3 parity scheme, recovery requires that the corresponding data elements on each of the still functioning storage devices, as well as the parity field, be read and then XOR-ed together to reproduce the data from the failed storage device. In some conventional implementations, a complete rebuild of the contents of the failed storage device must be performed before any subsequently received storage access requests are serviced. In other conventional implementations, storage access requests continue to be serviced during the rebuilding process, but subject to significant bandwidth and/or performance degradation.
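The XOR-based recovery described above can be sketched in Python as follows. The three two-byte data elements and the choice of failed device are hypothetical values used only for illustration, and `xor_blocks` is a hypothetical helper name.

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Hypothetical data elements held on three data storage devices.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]

# The parity field is the XOR of all data elements.
parity = xor_blocks(data)

# Assume device 1 fails. Its contents are reproduced by reading the
# corresponding element from every surviving device plus the parity field.
failed = 1
survivors = [block for i, block in enumerate(data) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[failed])
```

The sketch makes the source of the performance degradation visible: reproducing a single lost data element requires a read from every surviving storage device as well as from the parity device.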
The present application appreciates that it would be highly useful and desirable to accelerate access to the contents of a data storage system following an error or failure.