With the rapid development of computer network applications, the amount of network information keeps growing, and massive information storage has become particularly important. Traditional file storage systems can no longer meet the requirements of current applications for high capacity, high reliability, and high performance. Distributed storage systems, with their high scalability and high availability, have become an effective means of storing massive data. However, the data storage nodes in a distributed storage system are unreliable, so redundancy must be introduced for the system to provide a reliable storage service on top of unreliable nodes. The simplest way to introduce redundancy is to back up the raw data directly. Although direct backup is simple, its storage efficiency and system reliability are not high. Introducing redundancy by encoding improves the storage efficiency. In current storage systems, the encoding method generally adopts MDS (Maximum Distance Separable) codes, which achieve optimum storage space efficiency. For an (n, k) MDS EC (Erasure Code), the original file is divided into k equal fragments; n unrelated encoded fragments are generated by linear encoding; the fragments are stored in n different nodes; and the MDS property holds, namely the original file can be reconstructed from any k of the n encoded fragments. This encoding technology is important for providing efficient network storage redundancy, especially in large file storage and file data backup applications.
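The MDS property described above can be illustrated with the smallest non-trivial case. The following is a minimal sketch, assuming a (3, 2) single-parity XOR code chosen purely for illustration; it is not the encoding of any scheme discussed in this document, and the function names `encode` and `decode` are hypothetical.

```python
def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> list[bytes]:
    """Split data into k = 2 equal fragments and add one XOR parity fragment (n = 3)."""
    if len(data) % 2:
        # Illustrative padding only; a real system would record the true length.
        data += b"\x00"
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, _xor(a, b)]


def decode(available: dict[int, bytes]) -> bytes:
    """Rebuild the (padded) data from any 2 of the 3 fragments, keyed by fragment index."""
    if len(available) < 2:
        raise ValueError("at least k = 2 fragments are required")
    if 0 in available and 1 in available:
        a, b = available[0], available[1]
    elif 0 in available:
        a = available[0]
        b = _xor(a, available[2])  # b = a XOR parity
    else:
        b = available[1]
        a = _xor(b, available[2])  # a = b XOR parity
    return a + b


data = b"storage!"
frags = encode(data)
# Any two fragments suffice, so the loss of any single node is tolerated.
recovered = decode({1: frags[1], 2: frags[2]})
```

Each fragment is half the size of the file, so the three nodes together store 1.5 times the file size, whereas tolerating one failure by direct backup requires 2 times the file size.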
In the distributed storage system, data of size B is stored in n storage nodes, and the size of the data stored in each storage node is α. A data receiver can reconstruct the raw data simply by connecting to any k of the n storage nodes and downloading the data stored in them, which is known as the “data reconstruction process”. The RS (Reed-Solomon) code is a code that has the MDS property. When a storage node in the storage system fails, the data stored in the failed node must be repaired and stored in a new node in order to maintain the redundancy of the storage system, which is known as the “repair process”. However, in the repair process of the RS code, the data in k storage nodes must first be downloaded, the raw data must then be reconstructed, and finally the data of the failed node must be re-encoded and stored in the new node. Decoding the whole raw data in order to restore the data of a single storage node obviously wastes bandwidth.
Moreover, in the case of system node failure or file loss, the system redundancy may gradually decrease over time. Therefore, a mechanism is required for maintaining the system redundancy. The ECs (Erasure Codes) put forward in the literature [R. Rodrigues and B. Liskov, “High Availability in DHTs: Erasure Coding vs. Replication”, Workshop on Peer-to-Peer Systems (IPTPS) 2005.] are relatively effective in reducing the storage overhead, but they incur a relatively high communication overhead for redundancy recovery. FIG. 1 indicates that an original file can be acquired from the available nodes as long as the number d of effective nodes in the system is greater than or equal to k, namely d ≥ k. FIG. 2 indicates the process of restoring the content stored in a failed node. As illustrated in FIGS. 1 and 2, the whole recovery process comprises the following steps: 1) downloading data from k storage nodes in the system and reconstructing the original file; and 2) re-encoding a new fragment from the original file and storing the new fragment in a new node. The recovery process shows that the network load required for repairing any failed node is at least the content stored in k nodes.
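The cost of this two-step recovery can be made concrete with a short calculation. The sketch below is illustrative only: the function name `conventional_repair_traffic` and the example figures (a 1200 MB file with k = 4) are assumptions, not values taken from the cited literature.

```python
from fractions import Fraction


def conventional_repair_traffic(file_size: int, k: int):
    """Return (fragment_size, repair_traffic) when one lost fragment of an
    (n, k) erasure code is repaired by first reconstructing the whole file."""
    if k < 1:
        raise ValueError("k must be at least 1")
    fragment_size = Fraction(file_size, k)  # each node stores 1/k of the file
    repair_traffic = k * fragment_size      # k whole fragments are downloaded
    return fragment_size, repair_traffic


# Hypothetical example: a 1200 MB file encoded with k = 4.
fragment, traffic = conventional_repair_traffic(1200, 4)
overhead_factor = traffic / fragment  # traffic relative to the data actually repaired
```

In this example the new node downloads 1200 MB in order to regenerate a single 300 MB fragment, that is, k times the amount of data being repaired.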
Meanwhile, in order to reduce the bandwidth used in the repair process, the literature [A. G. Dimakis, P. G. Godfrey, M. J. Wainwright, K. Ramchandran, “Network Coding for distributed storage systems”, IEEE Proc. INFOCOM, Anchorage, AK, May 2007.] puts forward RGCs (Regenerating Codes) based on network coding theory; the RGCs also have the MDS property. During RGC repair, a new node must connect to d of the remaining storage nodes and download data of size β from each of them, so the RGC repair bandwidth is dβ. The same work also provides a functional repair model for RGCs. In addition, two types of optimum RGCs are provided, namely MSR (Minimum-storage Regenerating) codes and MBR (Minimum-bandwidth Regenerating) codes. The repair bandwidth of the RGCs is lower than that of RS codes, but the RGC repair process requires connections to d (d>k) storage nodes (these d nodes are known as “helper nodes”). Moreover, the helper nodes must perform random linear network coding operations on the data they store. In order to guarantee that the encoded packets are mutually independent, the RGC computation must be carried out over a large finite field.
The patent PCT/CN2012/083174 provides a method for encoding PPSRCs (Practical Projective Self-repairing Codes) and for reconstructing and repairing data. The PPSRCs have the two typical properties of self-repairing codes: first, a missing encoded fragment can be repaired by downloading, from other encoded fragments, an amount of data smaller than the whole file; and second, a missing encoded fragment is repaired from a specified number of fragments, wherein the specified number depends only on how many fragments are missing and not on which fragments are missing. Owing to these properties, the load for repairing one missing fragment is relatively low. In addition, because all nodes in the system have the same status and a balanced load, different missing fragments can be repaired independently and concurrently at different positions in the network.
In addition to meeting the above conditions, the code has the following characteristics: when one node fails, (n−1)/2 pairs of repair nodes are available for selection; and when (n−1)/2 nodes fail at the same time, two nodes among the remaining (n+1)/2 nodes are available for repairing the failed nodes.
The PPSRC encoding and self-repairing processes involve only XOR (Exclusive OR) operations, whereas the encoding process of general self-repairing codes involves polynomial arithmetic and is relatively complex. The computational complexity of the PPSRCs is lower than that of PSRCs (Projective Self-repairing Codes). Meanwhile, the PPSRCs are superior to the MSR codes in terms of repair bandwidth and number of repair nodes. Moreover, as the redundancy is controllable, the PPSRCs are applicable to general storage systems, and the optimum reconstruction bandwidth can be achieved.
In summary, the PPSRCs have the advantages of effectively reducing the number of data storage nodes, reducing the redundancy of system data storage, and greatly improving the practical value of the PSRCs.
However, the PPSRCs also have the following disadvantages. Firstly, the encoding and decoding processes of the PPSRCs are relatively complex; the number of division operations over finite fields and their subfields is relatively large; and the data reconstruction process is relatively complex. Secondly, in the PPSRCs, encoded fragments are inseparable, and thus the repair of the encoded fragments must also be inseparable. Thirdly, as the computational complexity of the whole encoding and decoding processes of the PPSRCs is relatively high, the redundancy, although controllable, is comparatively high. In general, the number of storage nodes required by the PPSRCs is very large, so the PPSRCs are unnecessary for relatively small files. Therefore, the PPSRCs are difficult to implement in practical distributed storage systems and have low versatility.
The patent PCT/CN2012/071177 provides a method of RGCs. In that proposal, a missing encoded fragment can be repaired by using a small amount of data rather than by reconstructing the whole file. The RGC uses linear network coding theory and the NC (Network Coding) property (namely max-flow min-cut) to reduce the overhead required for repairing an encoded fragment. It can be proved from network information theory that the missing fragment can be repaired with a network overhead equal to the data size of the missing fragment.
The main idea of the RGCs is to utilize the MDS property. When some storage nodes in the network fail, the data stored in them is lost; information must then be downloaded from the available effective nodes to repair the missing data fragment, and the repaired data is stored in a new node. A plurality of original nodes may fail over time, and regenerated nodes may themselves take part in later regeneration processes and help generate further new nodes. Therefore, the regeneration process must ensure two things: 1) the failed nodes are mutually independent and the regeneration process can be applied recursively; and 2) the original file can be restored from any k nodes.
FIG. 2 illustrates the regeneration process when a node fails. In a distributed system, each of the n storage nodes stores data of size α. When a node fails, a new node downloads data from d ≥ k other active nodes and uses that data for node regeneration. The amount downloaded from each node is β. Each storage node i is represented by a pair of nodes $X^{i}_{\mathrm{in}}$ and $X^{i}_{\mathrm{out}}$, which are connected by an edge whose capacity is the storage capacity α of the node. The regeneration process is illustrated by an information flow graph, wherein $X_{\mathrm{in}}$ acquires data of size β from each of any d active nodes in the system; data of size α is stored into $X_{\mathrm{out}}$ via the edge

$$X_{\mathrm{in}} \xrightarrow{\;\alpha\;} X_{\mathrm{out}};$$

and any data collector can access $X_{\mathrm{out}}$. The maximum information flow from the information source to the information sink is determined by the minimum cut in the graph. For the information sink to reconstruct the original file, the size of this flow cannot be less than the size of the original file.
There is a tradeoff between the storage capacity α of each node and the bandwidth γ required for the regeneration of a node. Therefore, the MBR codes and the MSR codes are introduced. At the minimum-storage point, each node must store at least M/k bits (M being the size of the original file), from which the operating point of the MSR codes can be deduced:

$$(\alpha_{\mathrm{MSR}},\ \gamma_{\mathrm{MSR}}) = \left(\frac{M}{k},\ \frac{Md}{k(d-k+1)}\right).$$

When d takes its maximum value, namely when a new node connects to all n−1 active nodes simultaneously, the repair bandwidth $\gamma_{\mathrm{MSR}}$ is minimum, namely

$$\gamma_{\mathrm{MSR}}^{\min} = \frac{M}{k}\cdot\frac{n-1}{n-k}.$$

Moreover, as the MBR codes have the minimum repair bandwidth, it can be deduced that the minimum repair load

$$(\alpha_{\mathrm{MBR}}^{\min},\ \gamma_{\mathrm{MBR}}^{\min}) = \left(\frac{M}{k}\cdot\frac{2n-2}{2n-k-1},\ \frac{M}{k}\cdot\frac{2n-2}{2n-k-1}\right)$$

can be achieved when d=n−1.
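The MSR and MBR operating points given above can be evaluated numerically to see the storage and bandwidth tradeoff. The following is a minimal sketch that simply evaluates those formulas with exact rational arithmetic; the function names and the example parameters (M = 1, n = 10, k = 5) are assumptions chosen for illustration.

```python
from fractions import Fraction


def msr_point(M, k: int, d: int):
    """(alpha, gamma) of an MSR code: alpha = M/k, gamma = M*d / (k*(d-k+1))."""
    if not d >= k >= 1:
        raise ValueError("requires d >= k >= 1")
    alpha = Fraction(M) / k
    gamma = Fraction(M) * d / (k * (d - k + 1))
    return alpha, gamma


def mbr_point_max_d(M, k: int, n: int):
    """(alpha, gamma) of an MBR code with d = n - 1 helper nodes:
    both equal (M/k) * (2n-2) / (2n-k-1)."""
    if not n > k >= 1:
        raise ValueError("requires n > k >= 1")
    value = Fraction(M) / k * Fraction(2 * n - 2, 2 * n - k - 1)
    return value, value


# Hypothetical parameters: unit-size file, n = 10 nodes, k = 5.
M, n, k = 1, 10, 5
alpha_msr, gamma_msr = msr_point(M, k, d=n - 1)   # (1/5, 9/25)
alpha_mbr, gamma_mbr = mbr_point_max_d(M, k, n)   # (9/35, 9/35)
```

With these parameters, an MSR node stores 1/5 of the file and is repaired with 9/25 of the file in traffic, whereas an MBR node stores 9/35 of the file and is repaired with the same amount, 9/35. Setting d = k in `msr_point` gives γ = M, which matches the cost of repairing by full reconstruction.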
For the problem of failed node repair, three repair models are considered: firstly, exact repair, in which the failed fragment must be reconstructed exactly, so that the restored information is the same as the missing information (the core technologies are interference alignment and NC); secondly, functional repair, in which the new fragment may contain data different from that of the missing node as long as the repaired system still has the MDS code property (the core technology is NC); and thirdly, exact repair of the systematic part, which is a hybrid model between exact repair and functional repair, wherein systematic nodes (used for storing uncoded data) require exact repair, namely the restored information must be the same as the information stored in the failed node, while non-systematic nodes (used for storing encoded fragments) require only functional repair, namely the restored information need only preserve the MDS code property (the core technologies are interference alignment and NC).
When the RGCs are applied to practical distributed systems, a missing fragment can be repaired only by downloading data from at least k nodes, even if the repair is not optimal. Therefore, even though the amount of data transmitted in the repair process is relatively low, the RGCs impose a high protocol load and a high system design (NC technology) complexity. Moreover, as engineering solutions such as a lazy repair process are not considered in the RGCs, repair load caused by temporary failures cannot be avoided. Furthermore, the computational overhead required for encoding and decoding with the NC-based RGCs is relatively large, an order of magnitude higher than that of traditional ECs.