Data servers capable of storing massive amounts of data are used in various networks, in particular for storing the fast-growing quantity of data generated by the ever-increasing number of social network users, or for addressing the needs of cloud network operators for managing customer data stored in the so-called “cloud”. Data centers hosting such servers typically include one or several data storage nodes, in which data is stored, with the requirement that the data be available, that is, retrievable, at all times. Such a requirement implies that data loss and data corruption are unacceptable, which has led to security solutions consisting for the most part in the replication of stored data, with a replication factor generally equal to three but which may in some cases reach a value as high as seven.
Data replication solutions with a high replication factor are particularly sub-optimal when used with massive amounts of data, in that they severely increase the required data storage space and the cost of the associated hardware, not to mention the associated carbon footprint. The severity of this energy and hardware cost issue and, as a consequence, the total cost of storage, have been reduced through the use of erasure coding techniques, such as Reed-Solomon coding.
Erasure coding generates redundancy in the form of encoded data, the size of which is reduced as compared to strict replication of the data.
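The reduction in redundancy can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: the function names and the (k = 10, m = 4) code parameters are hypothetical and are not taken from the present description. It assumes a maximum distance separable code such as Reed-Solomon, in which k data fragments are extended with m redundancy fragments and any m lost fragments can be tolerated.

```python
from fractions import Fraction


def replication_overhead(factor: int) -> Fraction:
    """Raw storage consumed per unit of user data with plain replication."""
    return Fraction(factor)


def erasure_overhead(k: int, m: int) -> Fraction:
    """Raw storage consumed per unit of user data with a (k, m) erasure code.

    k data fragments are stored together with m redundancy fragments,
    so k + m fragments are kept for every k fragments of user data.
    """
    return Fraction(k + m, k)


# Replication factor of three (from the text): tolerates 2 lost copies.
print(float(replication_overhead(3)))

# Hypothetical (k=10, m=4) code: tolerates 4 lost fragments (MDS assumption).
print(float(erasure_overhead(10, 4)))
```

Under these assumptions, threefold replication stores three units per unit of user data, whereas the hypothetical (10, 4) code stores 1.4 units while tolerating more simultaneous losses.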
The use of Reed-Solomon coding for data storage applications is discussed in “Erasure Coding vs. Replication: A Quantitative Comparison”, H. Weatherspoon and J. D. Kubiatowicz, in Proceedings of the First International Workshop on Peer-to-Peer Systems (IPTPS), 2002.
The execution of erasure coding and decoding algorithms when storing and retrieving data, respectively, generates latency in data storage or retrieval, which should be minimized in order to leverage the full benefits of erasure coding in data storage solutions. This latency is further increased at the decoding stage in the case of data erasure, where erased data has to be reconstructed for complete retrieval of the stored data.
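The extra work incurred at the decoding stage when data has been erased can be sketched with a deliberately simple code. The example below is not Reed-Solomon coding and is not the algorithm of the present description: it substitutes a single XOR parity block, which can repair at most one erased block, and all function names are hypothetical. It assumes equal-length blocks and marks an erased block as None.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks: List[bytes]) -> List[bytes]:
    """Append one parity block (XOR of all data blocks) to the data blocks."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def decode(stored: List[Optional[bytes]], k: int) -> List[bytes]:
    """Return the k data blocks, rebuilding at most one erased block."""
    missing = [i for i, block in enumerate(stored) if block is None]
    if len(missing) > 1:
        raise ValueError("a single parity block can repair only one erasure")
    if missing:
        # Reconstruction step: this extra pass over all surviving blocks
        # is the additional decoding latency incurred when data is erased.
        survivors = [block for block in stored if block is not None]
        stored = list(stored)
        stored[missing[0]] = xor_blocks(survivors)
    return stored[:k]


data = [b"abcd", b"efgh", b"ijkl"]
stored = encode(data)
stored[1] = None  # simulate the erasure of one stored block
recovered = decode(stored, k=len(data))
```

When no block is erased, decode simply returns the first k blocks; the reconstruction pass runs only on erasure, which is where the additional latency arises.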
There therefore remains a need for improved erasure coding and decoding algorithms, with respect to their algorithmic complexity and latency performance, in particular at the decoding stage.