Network systems and storage devices need to reliably handle and store data and, thus, typically implement some type of scheme for recovering data that has been lost, degraded or otherwise compromised. At the most basic level, one recovery scheme could simply involve creating one or more complete copies or mirrors of the data being transferred or stored. Although such a recovery scheme may be relatively fault tolerant, it is not very efficient with respect to the need to duplicate storage space. Other recovery schemes involve performing a parity check. Thus, for instance, in a storage system having stored data distributed across multiple disks, one disk may be used solely for storing parity bits. While this type of recovery scheme requires less storage space than a mirroring scheme, it is not as fault tolerant, since any two device failures would result in an inability to recover any compromised data.
Thus, various recovery schemes have been developed with the goal of increasing efficiency (in terms of the amount of extra data generated) and fault tolerance (i.e., the extent to which the scheme can recover compromised data). These recovery schemes generally involve the creation of erasure codes that are adapted to generate and embed data redundancies within original data packets, thereby encoding the data packets in a prescribed manner. If such data packets become compromised, as may result from a disk or sector failure, for instance, such redundancies could enable recovery of the compromised data, or at least portions thereof. Various types of erasure codes are known, such as Reed-Solomon codes, RAID variants, array codes (e.g., EVENODD, RDP, etc.), low-density parity check codes (e.g., Tornado codes, Raptor codes, rateless codes, etc.) and XOR-based erasure codes. However, encoding or decoding operations of erasure codes often are computationally demanding, typically rendering their implementation cumbersome in network systems, storage devices, and the like.
In addition, determining the fault tolerance of a particular erasure code, and thus the best manner in which to implement a chosen code can be challenging. For instance, fault tolerance determinations often do not factor in the fault tolerance of the devices themselves, thus leading to imprecision in assessing the actual fault tolerance of the recovery scheme. Thus, efforts to select an optimal erasure code implementation for a particular system could be impeded. Further, uncertainty regarding the fault tolerance of a particular code can impact the manner in which data is allocated among various storage devices and/or communication channels. Such uncertainty could hamper a user's ability to optimally store and/or allocate data across storage devices. Similarly, such uncertainty also could hamper efforts to allocate and route data across communication network channels, inasmuch as those systems could not function as desired.