Cloud storage refers to a system that uses cluster applications, grid technology, or a distributed file system, coordinated by application software, to integrate a large number of heterogeneous storage devices on a network so that they work collaboratively and provide data storage and service access functions externally. At present, there are two cloud storage architectures: a centralized architecture, represented by the Google File System, and a decentralized massive distributed storage architecture (a peer-to-peer storage architecture) based on peer-to-peer (P2P) technology.
A logical volume in a peer-to-peer storage system is a virtual storage space formed by storage blocks of a certain size, and it externally provides block storage services. The underlying storage engine mainly relies on a ring constructed by using distributed hash table (DHT) technology. The DHT ring integrates the entire storage space by using virtualization technology and transparently provides a logical volume service with a certain capacity for an upper layer. The DHT ring is logically segmented into N partitions, and the data block stored in each storage block of a logical volume is allocated to a specified partition in the DHT ring by using a hash algorithm. To improve data storage reliability, two copies of the data block are stored in the two adjacent partitions that follow the specified partition (the number of backup copies is adjustable). The data block can therefore be considered to be mapped to three partitions, that is, to a group of logically consecutive partitions. Because each data block and its copies are stored separately in three consecutive partitions, storage reliability for the three consecutive partitions needs to be ensured. The prior art only ensures that the three partitions are not mapped to the same physical disk; that is, three physical disks are randomly selected from all physical disks and used as a physical disk combination to store the three consecutive partitions.
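The mapping described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular storage system: the function names `partitions_for_block` and `random_disk_combination`, the choice of SHA-256 as the hash algorithm, the partition count of 16, and the disk labels are all assumptions made for the example.

```python
import hashlib
import random

N_PARTITIONS = 16  # assumed value of N for this sketch
N_REPLICAS = 3     # the data block plus two copies, as in the text


def partitions_for_block(block_id: str,
                         n_partitions: int = N_PARTITIONS,
                         n_replicas: int = N_REPLICAS) -> list:
    """Map a data block to a group of logically consecutive partitions.

    The block is hashed to one specified partition; its copies go to the
    partitions that follow it, wrapping around the ring.
    """
    digest = hashlib.sha256(block_id.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % n_partitions
    return [(primary + i) % n_partitions for i in range(n_replicas)]


def random_disk_combination(disks, n_replicas: int = N_REPLICAS, rng=None):
    """Pick distinct physical disks at random for one partition group.

    This mirrors the random-mapping policy: the only guarantee is that the
    partitions of a group do not share a physical disk.
    """
    rng = rng or random.Random()
    return tuple(rng.sample(list(disks), n_replicas))


# Hypothetical usage: one block, five disks.
group = partitions_for_block("volume-1/block-42")
disks_for_group = random_disk_combination(
    ["disk0", "disk1", "disk2", "disk3", "disk4"], rng=random.Random(0))
print(group, disks_for_group)
```

The sketch treats each group of consecutive partitions independently when choosing disks, which is a simplification; on a real ring, neighbouring groups overlap and share partitions.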
In the foregoing process, the inventor finds that the prior art has at least the following problems:
The prior art uses a random-mapping policy. When the number of partitions is very large compared with the number of physical disks, at least one group of three consecutive partitions is placed in each possible physical disk combination, because the combinations are randomly selected so as to balance data and input/output traffic across the physical disks. When all physical disks in any one physical disk combination are faulty, a data block and its two copies are lost together, which results in low data reliability. For example, when the total number of physical disks is 5, the number of combinations of three physical disks is 10, obtained by choosing 3 from 5. When any one of the 10 physical disk combinations is faulty, that is, when any three physical disks are faulty, the data blocks and their two copies stored on the faulty physical disks are lost. Even when the number of partitions is less than the number of physical disks, there is still a high probability of losing both a data block and its two copies due to a fault of a physical disk combination, so the technical issue of low data reliability may also arise.
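The five-disk example can be checked with a short sketch. It counts the possible three-disk combinations and then simulates random mapping to see how many of those combinations end up holding at least one partition group. The function names, the fixed random seed, and the group count of 1000 are assumptions chosen for illustration, and each partition group is treated as an independent random draw, which is a simplification.

```python
import itertools
import math
import random


def all_combinations(n_disks: int, n_replicas: int = 3) -> list:
    """Enumerate every possible combination of n_replicas physical disks."""
    return list(itertools.combinations(range(n_disks), n_replicas))


def combinations_in_use(n_groups: int, n_disks: int,
                        n_replicas: int = 3, seed: int = 0) -> set:
    """Simulate random mapping: each partition group gets a random disk combination."""
    rng = random.Random(seed)
    used = set()
    for _ in range(n_groups):
        combo = tuple(sorted(rng.sample(range(n_disks), n_replicas)))
        used.add(combo)
    return used


def fatal_fraction(used: set, n_disks: int, n_replicas: int = 3) -> float:
    """Fraction of n_replicas-disk failure sets that wipe out some partition group."""
    return len(used) / math.comb(n_disks, n_replicas)


print(math.comb(5, 3))  # 10 combinations of three disks out of five

# With many more partition groups than combinations, random mapping tends to
# occupy every combination, so any three-disk failure loses some data block
# together with both of its copies.
used = combinations_in_use(n_groups=1000, n_disks=5)
print(len(used), fatal_fraction(used, n_disks=5))
```

Under these assumptions, the fraction returned by `fatal_fraction` approaches 1 as the number of partition groups grows, which is the reliability problem the paragraph describes.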