The amount of electronic data stored and accessed by storage devices increases daily. The number and types of storage devices used to store and access data also continue to expand. Even as miniaturization and advancing technology increase the sophistication, reliability, and capacity of storage devices, including hard disk drives (HDD), shingled magnetic recording (SMR) devices, and tape drives, improved efficiencies are constantly sought for these devices. Improved efficiencies are needed because users who store data have limited resources. Limited resources may include electricity, cooling capacity, or physical storage space associated with storage devices. In particular, electricity is limited and may be costly. Additionally, as more and more storage devices store and access more and more data, the power and resources required to operate those devices, and to maintain the facilities in which those devices are stored, continue to increase.
Data that is stored or transmitted may be protected against storage media failures or other loss by storing extra copies, by storing additional redundant information, or in other ways. One type of redundancy-based protection involves using erasure coding. Erasure coding uses additional redundant data to produce erasure codes that protect against ‘erasures’. An erasure code (EC) allows data portions that are lost to be reconstructed from the surviving data. Erasure codes have typically been applied to data storage for the purpose of recovering data in the face of failures of hardware elements storing the data. Some erasure codes may be simple to compute (e.g. systematic data) while other erasure codes may be more complex to compute (e.g. non-systematic data). The computational complexity may be a function of the approach used to compute parity data. For example, Reed-Solomon (RS) codes that compute parity data based on Galois Field arithmetic may have a higher computational cost than other approaches, including approaches based on logical binary operations. Similarly, it may be simpler to recover data using some types of erasure codes in which the data may already be available at the time of an access request (e.g. systematic data), and it may be more complex to recover data using other types of erasure codes (e.g. non-systematic data, which requires decoding). However, conventional systems may compute erasure codes without considering the complexity of encoding the data. Conventional systems may also store erasure codes without considering the complexity of recovering the data. Thus, in conventional systems, the efficiency of encoding or recovering data based on the type of erasure code is not optimal. For example, conventional systems that store data and ECs on disk use sequential disk writes that do not consider the type of EC being written or the different energy requirements for reading or writing data at different zones on a disk.
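The distinction between inexpensive binary parity and more costly Galois Field arithmetic may be illustrated with a minimal sketch. The sketch below is not drawn from any particular system. The function names (`xor_bytes`, `encode`, `recover`, `gf256_mul`), the single-parity layout, and the reduction polynomial 0x11D are assumptions chosen only for illustration. The first part shows a systematic code in which the data blocks are stored unchanged and one parity block is computed with a logical XOR. The second part shows a single multiplication in GF(2^8), the kind of operation an RS encoder may perform many times per byte.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks (a logical binary operation)."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_blocks):
    """Systematic encoding (illustrative): data blocks are kept unchanged
    and a single XOR parity block is appended."""
    parity = reduce(xor_bytes, data_blocks)
    return list(data_blocks) + [parity]


def recover(blocks, lost_index):
    """Rebuild the block at lost_index by XOR-ing every surviving block.
    Works for exactly one erasure, which is all one parity block can cover."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return reduce(xor_bytes, survivors)


def gf256_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two elements of GF(2^8) by shift-and-add.
    The reduction polynomial 0x11D is an assumption for illustration.
    Each multiplication needs a loop of shifts, tests, and XORs, which is
    why Galois Field parity may cost more than a plain XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


data = [b"ABCD", b"EFGH", b"IJKL"]
stored = encode(data)

# Systematic property: the original data can be read back with no decoding.
assert stored[:3] == data

# Simulate the erasure of block 1 and rebuild it from the survivors.
assert recover(stored, 1) == b"EFGH"
```

In this sketch a read of intact data touches only the data blocks, while the parity block is needed only after an erasure. That asymmetry is the property the passage above attributes to systematic codes.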
Adding redundancy introduces overhead that consumes more storage capacity or transmission bandwidth, which in turn adds cost and may increase energy consumption. The overhead added by erasure code processing tends to increase as the protection level increases. Ideally, the redundant information may never need to be accessed, and thus conventional systems may group all redundancy data together and store it in some out-of-the-way place. This one-size-fits-all approach may produce sub-optimal results, particularly concerning energy conservation.
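The relationship between protection level and overhead may be illustrated with simple arithmetic. The sketch below assumes a code in which k data fragments are protected by m parity fragments and in which up to m lost fragments can be tolerated, as is the case for maximum distance separable codes such as RS codes. The function name `storage_overhead` and the example parameters are hypothetical and are not taken from any particular system.

```python
def storage_overhead(k: int, m: int) -> float:
    """Extra capacity consumed by parity, as a fraction of the data size.
    Assumes k data fragments and m parity fragments of equal size."""
    return m / k


# Hypothetical configurations: the same 10 data fragments, more parity each time.
for k, m in [(10, 2), (10, 4), (10, 6)]:
    print(f"k={k}, m={m}: tolerates up to {m} erasures, "
          f"overhead {storage_overhead(k, m):.0%}")
```

Under these assumptions, each additional tolerated erasure costs one more fragment of capacity, so the overhead grows in step with the protection level.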