In the digital age, organizations increasingly rely on digitally stored data. To protect against data loss, an organization may use a redundant storage system to store important data. Redundant storage systems may store more information than is necessary to recover the underlying data, such that even if some information is lost (e.g., from damage to a storage device) the storage system may still fully recover the underlying data.
For example, a redundant storage system may apply an erasure code to data to be stored on the system, creating a number (“n”) of data points to represent the data. Of the n data points, a certain number (“k”) may suffice to retrieve the original data from the storage system. Unfortunately, erasure codes traditionally used in redundant storage systems (such as Tornado codes and Reed-Solomon error correction techniques) may suffer from various deficiencies. For example, redundant storage systems using Tornado codes may provide only probabilistic guarantees that k data points will suffice to retrieve the original data, such that there may exist a set of critical data points whose loss results in loss of the original data. Furthermore, attempting to identify such sets of critical data points may be difficult or inefficient. Additionally, redundant storage systems using Tornado codes may not use storage space efficiently.
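The relationship between n and k described above may be illustrated with a minimal sketch. The following toy scheme is an assumption made purely for illustration and is not the technique of the instant disclosure: it treats k data symbols as the coefficients of a polynomial over the prime field GF(257), evaluates that polynomial at n distinct points, and recovers the symbols from any k surviving points by Lagrange interpolation. The names `encode`, `decode`, and `P` are hypothetical, and the interpolation is deliberately naive rather than efficient.

```python
# Toy (n, k) erasure code over GF(257). Illustrative assumption only;
# this is not the method of the disclosure and is not optimized.
P = 257  # prime modulus, so every byte value 0..255 is a field element


def encode(data, n):
    """Evaluate the polynomial whose coefficients are `data` at x = 1..n."""
    points = []
    for x in range(1, n + 1):
        y = 0
        for coeff in reversed(data):  # Horner's rule, highest degree first
            y = (y * x + coeff) % P
        points.append((x, y))
    return points


def decode(points, k):
    """Recover the k coefficients from any k distinct surviving points."""
    if len(points) < k:
        raise ValueError("fewer than k data points survive")
    pts = points[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(pts):
        basis = [1]  # product of (x - xj) for j != i, lowest degree first
        denom = 1    # product of (xi - xj) for j != i
        for j, (xj, _) in enumerate(pts):
            if j == i:
                continue
            nxt = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                nxt[d] = (nxt[d] - xj * c) % P
                nxt[d + 1] = (nxt[d + 1] + c) % P
            basis = nxt
            denom = (denom * (xi - xj)) % P
        # Divide by denom using the modular inverse (Fermat's little theorem).
        scale = yi * pow(denom, P - 2, P) % P
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs


# Usage: k = 3 symbols stored as n = 6 data points; three points are lost.
data = [72, 105, 33]
stored = encode(data, 6)
survivors = [stored[1], stored[4], stored[5]]
assert decode(survivors, 3) == data
```

In this sketch any three of the six stored points suffice to recover the data, which is a deterministic guarantee, whereas a code offering only probabilistic guarantees may fail for particular combinations of lost points.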
Redundant storage systems using Reed-Solomon error correction techniques may suffer from other deficiencies. For example, encoding the data to be stored into the data points and then decoding these data points back into the original data may be computationally intensive. Furthermore, the encoding and decoding tasks may scale poorly (e.g., with quadratic time complexity). Redundant storage systems using Reed-Solomon error correction techniques may also perform poorly when k is significantly smaller than n. Additionally, the computational resources required for decoding may increase with the number of erasures (i.e., data points missing from the original n data points). Accordingly, the instant disclosure identifies and addresses a need for efficient and reliable redundant data storage.