Data corruption in storage environments can have many different causes, including faults in hardware, networks, disks and software, as well as environmental, radiation and electrical disturbances, all of which can lead to data errors in client applications. In today's data environment, where more and more focus is on distributed data and applications, this problem moves from the more secure data centers (DCs) to small Internet of Things (IoT) devices and the Internet. To mitigate problems with data errors, DCs replicate the data over several DC sites so that copies of the data are available at all times. However, replication creates time gaps between the data copies, multiplies the amount of data, and creates a lot of extra work for the DCs to maintain all the data.
The introduction of Forward Error Correction (FEC) codes greatly improved this situation in DCs for the handling of Redundant Array of Inexpensive Disks (RAID) storage. However, the present Reed-Solomon FEC code and similar FEC codes are not well suited to meeting tomorrow's needs for widely distributed storage.
There is a great need for a high-performance FEC code with built-in Data Error detection And Data error Correction (DEADC), adapted for widely distributed storage solutions and achieving end-to-end data integrity.
The paper by David Fiala, Frank Mueller, Christian Engelmann, Rolf Riesen, Kurt Ferreira and Ron Brightwell, "Detection and Correction of Silent Data Corruption for Large-Scale High-Performance Computing", SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Article No. 78, discusses that faults have become the norm for high-end computing clusters. The paper further discloses that even a single error can have profound effects on applications by causing a cascading pattern of corruption, which in most cases spreads to other processes.
The proposed technology aims to at least mitigate some of the problems related to data corruption.