In a distributed storage system, reliability is achieved by adopting either a file-level multi-copy redundancy technology or a data-block-level redundancy coding technology, for example, an erasure coding technology. In the multi-copy redundancy technology, the probability of data loss is reduced by storing multiple identical copies of one data file. In the redundancy coding technology, reliability is improved by adding check blocks for each portion of the data in a file.
Generally, a distributed hash table (DHT) may be adopted to store data blocks and check blocks. However, because placement by the DHT is effectively random, multiple blocks of the same data slice may be deployed on the same physical storage node (for example, a rack, a server, or a hard disk), and therefore the failure of a single physical storage node carries a risk of data loss. For example, in an M+N erasure coding technology, where M is the number of data blocks and N is the number of check blocks, a data slice can be restored only if no more than N of its M+N blocks are lost. When more than N data blocks or check blocks (that is, N+1 or more) of a data slice are deployed on the same hard disk, a failure of that hard disk makes the M data blocks unrecoverable, and therefore may result in unavailability of the whole file. Using a 12+3 redundancy storage mechanism as an example, when more than 3 blocks (that is, 4 or more) are lost, the data slice is lost and cannot be restored.
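The single-node failure condition described above can be illustrated with a minimal sketch. It assumes an erasure code that tolerates the loss of any N blocks of a data slice; the function name `slice_survives_single_failure`, the `placement` list, and the disk identifiers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def slice_survives_single_failure(placement, n_check):
    """Return True if the slice survives the failure of any one node.

    placement: one node identifier per block (data and check) of the slice.
    n_check:   N, the number of check blocks (assumed equal to the number
               of lost blocks the code can tolerate).
    """
    blocks_per_node = Counter(placement)
    # Losing the most heavily loaded node is the worst single failure.
    worst_case_loss = max(blocks_per_node.values())
    return worst_case_loss <= n_check


# 12+3 example: 15 blocks in total, N = 3.
spread = [f"disk{i}" for i in range(15)]                         # one block per disk
clustered = ["disk0"] * 4 + [f"disk{i}" for i in range(1, 12)]   # 4 blocks on disk0

print(slice_survives_single_failure(spread, 3))
print(slice_survives_single_failure(clustered, 3))
```

In this sketch the `spread` placement survives any single disk failure, whereas the `clustered` placement does not, because a failure of `disk0` removes 4 blocks and the 12+3 code is assumed to tolerate at most 3.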
In other words, in an existing distributed storage system, a single-point failure (for example, of a hard disk, a server, or a rack) may result in data loss. The probability of such a loss is especially high when the scale of the distributed storage system is relatively small, because fewer physical storage nodes are available over which to spread the blocks of a data slice, and the reliability of the distributed storage system is reduced accordingly.
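The effect of system scale can be illustrated with a rough simulation. The sketch below is not a model of any particular DHT: it assumes each block is placed uniformly at random and independently on one of `num_nodes` nodes, and it assumes a 12+3 code that tolerates the loss of any 3 blocks. The function name `estimate_risk` and all parameters are hypothetical.

```python
import random
from collections import Counter


def estimate_risk(num_nodes, m_data=12, n_check=3, trials=2000, seed=0):
    """Estimate the fraction of random placements in which some single
    node holds more than n_check blocks of one data slice."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    at_risk = 0
    for _ in range(trials):
        # Assumption: uniform, independent placement stands in for the DHT.
        placement = [rng.randrange(num_nodes) for _ in range(m_data + n_check)]
        if max(Counter(placement).values()) > n_check:
            at_risk += 1
    return at_risk / trials


for nodes in (5, 50, 1000):
    print(nodes, estimate_risk(nodes))
```

Under these assumptions the estimated fraction of at-risk placements is close to 1 for a 5-node system and falls sharply as the number of nodes grows, which is consistent with the observation that small systems are the most exposed to a single-point failure.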