Data files, such as but not limited to pictures, documents, spreadsheets, videos, sound recordings, presentations, scans, etc., are important to businesses, governments, and individuals. These data files are often stored on personal computers, laptops, small (local) servers, or other similar devices. Lightning, fire, flood, theft, a hard drive error or failure, or other unfortunate event can, however, render some or all of the data difficult or impossible to recover from those devices.
To provide for recovery of data in the event that it is no longer available from the various types of devices mentioned above, it is not uncommon for the data to be backed up via a distributed storage system, which may include, for example, different media and/or multiple servers, which may be geographically distributed and/or on, for example, a server farm. The different media, however, can be lost, stolen or damaged, or a server might suffer a problem, such as a hard drive error or failure.
One approach for providing for redundancy in storage of the data files is placing a copy of a data file on multiple server computers. This approach works, but can be expensive as the total storage space required is the size of the original data file multiplied by the number of servers on which the file is stored. Another approach is to use erasure coding, with the erasure coded information being stored on the several servers. Erasure coding can reduce the size of the storage space required for reliable storage of a file when compared to the space required for storing a complete copy of the data file on multiple servers.
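The storage trade-off can be illustrated with a short calculation. This is a minimal sketch, not part of the disclosure: the function names are hypothetical, and the erasure-coded figure assumes an m-of-n scheme in which each of the n fragments is 1/m the size of the data file, ignoring any per-fragment metadata.

```python
def replication_storage(file_size: int, copies: int) -> int:
    """Total bytes used when a complete copy is stored on each of `copies` servers."""
    return file_size * copies


def erasure_storage(file_size: int, m: int, n: int) -> float:
    """Total bytes used by an m-of-n scheme: n fragments, each 1/m of the file.

    Assumption: fragment size is exactly file_size / m, with no metadata overhead.
    """
    return file_size * n / m


if __name__ == "__main__":
    size = 1_000_000  # a 1 MB file, chosen only for illustration
    print(replication_storage(size, 3))  # three full copies
    print(erasure_storage(size, 3, 6))   # 3-of-6 erasure coding
```

Under these assumptions, three full copies occupy three times the file size, while a 3-of-6 scheme occupies twice the file size.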
Erasure coding breaks a data file into n fragments and generates a mathematical construct for each fragment. Not all n fragments are required, however, in order to reconstruct the original data file. For example, in an m-of-n erasure coding scheme, a data file is broken into n fragments, each fragment being 1/m the size of the data file, the fragments are encoded, and different fragments may be placed on different servers. When it is desired to reconstruct the data file, only m non-identical fragments are required to accurately construct the original data file.
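As a toy illustration of the m-of-n idea, the following sketch implements a 2-of-3 scheme using XOR parity. It is an assumption made for illustration only, not the encoding used by any particular system: the function names are hypothetical, and practical systems typically use more general codes. The file is split into two halves, a third parity fragment is computed, and any two of the three fragments suffice to rebuild the file.

```python
def _xor(p: bytes, q: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))


def encode_2_of_3(data: bytes) -> list[bytes]:
    """Split data into two halves (zero-padded) plus one XOR parity fragment."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\x00")  # pad second half to equal length
    return [a, b, _xor(a, b)]


def decode_2_of_3(fragments: dict[int, bytes], original_len: int) -> bytes:
    """Rebuild the data from any two fragments.

    `fragments` maps fragment index (0 = first half, 1 = second half,
    2 = parity) to its bytes. `original_len` removes the padding.
    """
    if len(fragments) < 2:
        raise ValueError("at least two distinct fragments are required")
    if 0 in fragments and 1 in fragments:
        a, b = fragments[0], fragments[1]
    elif 0 in fragments:
        a = fragments[0]
        b = _xor(a, fragments[2])  # b = a XOR parity
    else:
        b = fragments[1]
        a = _xor(b, fragments[2])  # a = b XOR parity
    return (a + b)[:original_len]


if __name__ == "__main__":
    original = b"hello world"
    frags = encode_2_of_3(original)
    # Simulate losing the first fragment; the other two still suffice.
    print(decode_2_of_3({1: frags[1], 2: frags[2]}, len(original)))
```

Each fragment here is half the size of the file (rounded up), consistent with fragments being 1/m the size of the data file for m = 2.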
If a data file is reconstructed from several fragments, but one of the fragments has become corrupted, the reconstructed data file generally will not be a faithful reconstruction of the original data file. Such corruption can often be detected by comparing checksums of the original data file and the reconstructed data file. For example, once a data file has been reconstructed from fragments, the checksum for the original file is retrieved and compared with the checksum for the reconstructed data file. If the checksums match, then the reconstructed data file is most likely a faithful reconstruction of the original data file. If the checksums do not match then corruption has occurred. The corruption, however, may be in the reconstructed data file, or may be in the original checksum that was stored and retrieved.
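The checksum comparison described above might look like the following sketch. The choice of SHA-256 and the function names are assumptions for illustration; the text does not specify a particular checksum algorithm.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest of the data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def is_faithful(reconstructed: bytes, stored_checksum: str) -> bool:
    """Compare the checksum of the reconstructed file with the stored one.

    A mismatch indicates corruption, but not whether the fault lies in the
    reconstructed data or in the stored checksum itself.
    """
    return checksum(reconstructed) == stored_checksum


if __name__ == "__main__":
    original = b"example file contents"
    stored = checksum(original)           # computed when the file was stored
    print(is_faithful(original, stored))  # an uncorrupted reconstruction
    print(is_faithful(b"exbmple file contents", stored))  # one corrupted byte
```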
If corruption is detected, then one approach is to keep trying combinations of fragments until a combination yields a reconstructed data file with the correct checksum. This, however, may involve substantial computing time as there is potentially a factorial combination of fragments to try to reconstruct the block, that is, $\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$ combinations. For example, in a 3-of-6 erasure coding scheme, there are 20 possible fragment combinations, in a 4-of-7 scheme there are 35 possible combinations, and in a 5-of-10 scheme there are 252 possible combinations. Some systems use large n values, 20, 40, or higher, and so the number of possible combinations becomes very large, and the approach of trying all possible combinations quickly becomes impractical. Also, if the corruption is in the original checksum value, then it is likely that none of the numerous candidate files will provide a checksum that matches this corrupted checksum, even if one, more, or all of the candidate data files is actually valid.
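The combination counts quoted above can be checked with a few lines of code. This is a sketch for illustration only: the function names are hypothetical, and the values of m paired with n = 20 and n = 40 are assumptions, since the text gives only n for those cases.

```python
import itertools
import math


def combination_count(n: int, m: int) -> int:
    """Number of ways to choose m fragments out of n."""
    return math.comb(n, m)


def candidate_subsets(n: int, m: int):
    """Yield every set of m fragment indices a brute-force search would try."""
    return itertools.combinations(range(n), m)


if __name__ == "__main__":
    # Schemes named in the text.
    for m, n in [(3, 6), (4, 7), (5, 10)]:
        print(f"{m}-of-{n}: {combination_count(n, m)} combinations")
    # Larger n values; the m values here are assumed for illustration.
    for m, n in [(10, 20), (20, 40)]:
        print(f"{m}-of-{n}: {combination_count(n, m)} combinations")
```

Enumerating `candidate_subsets` and attempting a reconstruction for each subset is the exhaustive search whose cost grows with these counts.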
It is with respect to these and other considerations that the disclosure presented herein has been made.