In a storage system (such as a disk system or memory subsystem of a computer) it is common to replicate or mirror the storage to continue operation after failure; in a memory subsystem this is referred to as Memory Mirroring, and on disk storage systems as RAID 1.
It is recognized that disks are an inherently unreliable component of computer systems. Mirroring is a technique that allows a system to automatically maintain multiple copies of data so that, in the event of a disk hardware failure, the system can continue to process or quickly recover data. Mirroring may be done locally, where it specifically caters for disk unreliability, or remotely, where it forms part of a more sophisticated disaster recovery scheme, or both locally and remotely, especially for high availability systems. Normally data is mirrored onto physically identical drives, though the process can be applied to logical drives where the underlying physical format is hidden from the mirroring process. Typically mirroring is provided either in hardware solutions such as disk arrays or in software within the operating system.
In working storage systems it is not unusual to find small numbers of errors in the values read back from storage. In the case of random access memory (RAM), errors arise from failed cells and from transient faults caused by alpha particles or cosmic rays interacting with the RAM. To deal with these rare errors, systems include Error Correction Codes (ECC).
ECCs store some extra bits of data as a digest of a block of storage. When reloading the data the ECC (Ec) is recalculated from the loaded data (Dr) and compared with the ECC digest read from storage (E). If they differ, the ECC can indicate (for some errors) which bit to toggle to recover the original value.
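By way of illustration only, the recalculate-and-compare step may be sketched as follows. The sketch assumes a simple Hamming-style code over an 8-bit block with 4 check bits; the table CODES and the names ecc and read_with_ecc are hypothetical and are not taken from any particular ECC system.

```python
# Assumed column codes: one distinct 4-bit value per data bit, each with
# at least two bits set so that it cannot be confused with a single
# flipped check bit (which gives a syndrome of 1, 2, 4 or 8).
CODES = [3, 5, 6, 7, 9, 10, 11, 12]


def ecc(data: int) -> int:
    """Return the 4 check bits (the digest E) for an 8-bit data value."""
    e = 0
    for bit, code in enumerate(CODES):
        if (data >> bit) & 1:
            e ^= code
    return e


def read_with_ecc(dr: int, er: int):
    """Check recovered data dr against the recovered digest er.

    Returns (data, status); data is None when no correction is possible.
    """
    ec = ecc(dr)              # expected ECC, recalculated from loaded data
    syndrome = ec ^ er        # zero when the two digests agree
    if syndrome == 0:
        return dr, "ok"
    if syndrome in CODES:
        # The syndrome names the data bit to toggle.
        return dr ^ (1 << CODES.index(syndrome)), "corrected"
    if syndrome in (1, 2, 4, 8):
        # A single check bit was flipped; the data itself is intact.
        return dr, "ecc_bit_error"
    return None, "uncorrectable"
```

In this sketch the difference between the recalculated and stored digests (the syndrome) directly identifies the data bit to toggle, which corresponds to the "for some errors" case described above.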
Any particular ECC system has a limit to the number of errors that it can detect and how many errors it can correct in a given block of storage. For example, an ECC system may guarantee to hold enough information to correct a single bit error or detect pairs of errors. In such a system, if 3 bits in the block are corrupt, the ECC may or may not detect it, and if not detected, the system has no way of differentiating between it and a correct value.
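The limit can be illustrated with a small sketch. It assumes a hypothetical single-error-correcting code over an 8-bit block (the table CODES and the function ecc are illustrative assumptions, and this code is weaker than the single-correct, double-detect example above). Three particular bit flips cancel out in the digest, so the corrupted block passes the check.

```python
# Assumed column codes for a toy 8-bit, 4-check-bit code.
CODES = [3, 5, 6, 7, 9, 10, 11, 12]


def ecc(data: int) -> int:
    """Return the 4 check bits for an 8-bit data value."""
    e = 0
    for bit, code in enumerate(CODES):
        if (data >> bit) & 1:
            e ^= code
    return e


original = 0b10110010
stored_ecc = ecc(original)

# Flip data bits 0, 1 and 2. Their codes are 3, 5 and 6, and
# 3 ^ 5 ^ 6 == 0, so the digest of the corrupted block is unchanged.
corrupted = original ^ 0b00000111
undetected = (ecc(corrupted) == stored_ecc) and (corrupted != original)
```

Here undetected evaluates to True: the recalculated ECC matches the stored ECC even though the data is wrong, so the reader cannot distinguish the corrupted block from a correct one.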
In a system with both mirroring and ECC, the two are typically independent—in the sense that each side of the mirror has ECC, and if the ECC detects an uncorrectable error the only option is to use the data from the other mirrors.
Referring to FIG. 1, a schematic representation shows a mirrored data system 100 including a first mirror 110 and a second mirror 120.
The first mirror 110 includes a data set D1 111 which is stored in a storage medium. An ECC algorithm 130 is applied to the data set D1 111 to produce an ECC value E1 112.
Similarly, the second mirror 120 includes a data set D2 121 which is stored in a storage medium. The same ECC algorithm 130 is applied to the data set D2 121 to produce an ECC value E2 122.
Recovered data 151 in the first mirror 110 includes recovered data set Dr1 113 which is read from the storage medium and should be the same as the data set D1 111 (shown by hashed line). The recovered data set Dr1 113 has the ECC algorithm 130 applied to it to produce an expected ECC value Ec1 114. A recovered ECC value Er1 115 is also read from the storage medium and should be the same as the ECC value E1 112 (shown by hashed line).
Recovered data 152 in the second mirror 120 includes recovered data set Dr2 123 which is read from the storage medium and should be the same as the data set D2 121. The recovered data set Dr2 123 has the ECC algorithm 130 applied to it to produce an expected ECC value Ec2 124. A recovered ECC value Er2 125 is also read from the storage medium and should be the same as the ECC value E2 122 (shown by hashed line).
Each mirror has a set of data (D1, D2) and a set of ECC values (E1, E2). The same algorithm is used for both mirrors so that E1=ECC(D1) and E2=ECC(D2). On read, the recovered data (Dr1, Dr2) is used to calculate the expected ECC values Ec1=ECC(Dr1) and Ec2=ECC(Dr2), which are compared with the recovered ECC values Er1 and Er2. If Ec1=Er1 then Dr1 is taken to be valid; if Ec2=Er2 then Dr2 is taken to be valid. A mismatch indicates that either the data or the ECC value is corrupt. If only one mirror matches, its data is assumed to be correct. If both match then the choice is arbitrary. Errors that the ECC does not detect will allow corrupted data to be read.
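The read path just described may be sketched as follows, purely by way of example. CRC-32 from the Python standard library is used as a stand-in check function; it only detects errors and does not correct them, and it is an assumption of this sketch rather than the ECC algorithm 130. The name mirrored_read is likewise hypothetical.

```python
import zlib


def ecc(data: bytes) -> int:
    """Stand-in check function (CRC-32), assumed for illustration only."""
    return zlib.crc32(data)


def mirrored_read(dr1: bytes, er1: int, dr2: bytes, er2: int) -> bytes:
    """Return data from whichever mirror passes its own ECC check.

    dr1, dr2 are the recovered data sets; er1, er2 are the recovered
    ECC values read back alongside them.
    """
    if ecc(dr1) == er1:       # Ec1 matches: first mirror assumed correct
        return dr1            # (arbitrary choice if both mirrors match)
    if ecc(dr2) == er2:       # Ec2 matches: fall back to second mirror
        return dr2
    raise IOError("both mirrors failed the ECC check")
```

Each mirror is validated only against its own ECC value, so an error that the check function fails to detect on the first mirror is returned to the caller without the second mirror ever being consulted.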
It would also be possible to compare Dr1 and Dr2 to detect errors. If Dr1 and Dr2 do not match even though both ECC checks passed, then an error that the ECC failed to detect has been found, but it cannot be corrected since it is not possible to know which of Dr1 and Dr2 is correct.
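A minimal sketch of such a cross-comparison is given below. To make an undetected error easy to reproduce, it assumes a deliberately weak check function (the sum of the bytes modulo 256), which cannot detect two bytes being swapped; the names ecc and cross_check are hypothetical.

```python
def ecc(data: bytes) -> int:
    """Deliberately weak stand-in check: byte sum modulo 256 (assumed)."""
    return sum(data) % 256


def cross_check(dr1: bytes, er1: int, dr2: bytes, er2: int):
    """Validate each mirror against its ECC, then compare the mirrors.

    Returns (data, status); data is None when no trustworthy copy exists.
    """
    ok1 = ecc(dr1) == er1
    ok2 = ecc(dr2) == er2
    if ok1 and ok2:
        if dr1 == dr2:
            return dr1, "ok"
        # Both checks passed but the copies differ: an error escaped the
        # ECC, and there is no basis for choosing between the mirrors.
        return None, "mismatch_undecidable"
    if ok1:
        return dr1, "mirror1"
    if ok2:
        return dr2, "mirror2"
    return None, "both_failed"


good = b"ab"
swapped = b"ba"               # same byte sum, so the weak check passes
result = cross_check(swapped, ecc(good), good, ecc(good))
```

In this sketch result is (None, "mismatch_undecidable"): the comparison reveals the error, but nothing in the independent ECC values indicates which mirror holds the original data.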
As the size of compute clusters grows and storage sizes increase, the number of errors in the whole system increases.