A distributed data storage system generally stores data in multiple replicas so as to improve the reliability of data storage. The physical topology of the storage devices is generally hierarchical. FIG. 1 is a diagram of the physical topology of the storage devices in a distributed data storage system relating to the present invention. As shown in FIG. 1, the distributed data storage system is provided in a data center 10, which consists of three machine rooms M1, M2, and M3. Several racks are provided in each of the three machine rooms; for example, racks 1, 2, . . . , and N1 (R1, R2, . . . RN1) are deployed in machine room 2 (M2). Further, a plurality of hosts (computers) are provided on each of the racks; for example, hosts 1, 2, . . . , and N2 (H1, H2, . . . HN2) are provided on rack 1 (R1). In each host, a plurality of storage medium devices (generally hard disks) are provided; for example, hard disks 1, 2, . . . , and N3 (HD1, HD2, . . . HDN3) are provided in host 2 (H2). For clarity, FIG. 1 shows only some of the devices. It can be seen that the distributed data storage system has a tree structure, in which the storage medium devices are positioned on leaf nodes, and the hosts, racks and machine rooms are intermediate nodes.
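By way of illustration only, the tree structure described above can be sketched as follows. The type `Node`, the function `build_topology`, and the counts of racks, hosts and hard disks are hypothetical and chosen for brevity; only the three machine rooms and the naming pattern (M, R, H, HD) follow the description of FIG. 1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the physical topology tree (hypothetical type)."""
    name: str
    kind: str  # "datacenter", "room", "rack", "host", or "disk"
    children: list["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be nested."""
        self.children.append(child)
        return child

    def leaves(self):
        """Yield every leaf under this node (the storage medium devices)."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


def build_topology(rooms=3, racks=2, hosts=2, disks=2):
    """Build data center -> machine rooms -> racks -> hosts -> hard disks.

    The rack, host and disk counts are assumed values, not taken from FIG. 1.
    """
    root = Node("DC10", "datacenter")
    for m in range(1, rooms + 1):
        room = root.add(Node(f"M{m}", "room"))
        for r in range(1, racks + 1):
            rack = room.add(Node(f"R{r}", "rack"))
            for h in range(1, hosts + 1):
                host = rack.add(Node(f"H{h}", "host"))
                for d in range(1, disks + 1):
                    host.add(Node(f"HD{d}", "disk"))
    return root


topology = build_topology()
disk_count = sum(1 for _ in topology.leaves())
print(disk_count)  # 3 rooms * 2 racks * 2 hosts * 2 disks = 24
```

In this sketch every leaf is a hard disk, while the hosts, racks and machine rooms appear only as intermediate nodes, which matches the tree structure stated above.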
Distributed data storage systems can be divided into two modes, one with a center node and the other without a center node. Generally, a distributed data storage system with a center node includes a client, the center node and storage nodes, and data are divided into blocks and stored in multiple replicas. The positions at which the data replicas are stored are decided by the center node according to the load conditions of the storage nodes and the storage strategy of the replicas. The center node can be either a primary-backup configuration of two servers or a cluster of servers. In a distributed data storage system without a center node, the nodes are connected to one another, data are distributed randomly on the storage devices of the nodes, and the storage position of a piece of data can be obtained by any node using a hash function.
In the distributed data storage system with a center node, when data are read, the center node must first be accessed in order to acquire the storage positions of the data. This overloads the center node and reduces its processing efficiency, so that the center node becomes the bottleneck of the whole system and the system performance is reduced. Furthermore, if there is a failure in the center node, the bottleneck is aggravated, and the whole distributed data storage system may even become unavailable; the reliability of such a system is therefore relatively low.
The distributed data storage system without a center node can avoid the aforesaid bottleneck, but the reliability thereof is still insufficient. For example, if there is a power failure or network failure in a rack and all replicas of certain data are stored on hard disks of hosts on that rack, the data cannot be acquired. Besides, when the number of devices changes, for example, when one hard disk or one host is added, large-scale data migration will inevitably occur in the distributed data storage system without a center node. This is illustrated by the following simple example.
For example, suppose the distributed data storage system without a center node has 5 nodes (in practice, far more than 5), and the hash value of a piece of data to be stored, as calculated by the hash function, is 13. Taking 13 modulo the number of nodes, 5, gives 3, so the data to be stored is saved in node 3. When devices are added so that the number of nodes becomes 6 and the data is then read, taking hash value 13 modulo the node number 6 gives 1, i.e., the data would be read from node 1. The data must therefore first be migrated from node 3 to node 1. When the number of nodes changes, the result of the modulo operation will be different for most hash values, so that data migration will inevitably occur when data stored before the change of the number of nodes are read after the change. Consequently, once the number of nodes changes, data migration is widespread, which reduces the efficiency of the system and the lifetime of the storage medium.
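By way of illustration only, the modulo-based placement in the example above can be reproduced with the following minimal sketch. The function names `node_for` and `count_moved` are hypothetical, and the integers 0 to 29 are assumed stand-ins for the hash values of 30 pieces of stored data. The sketch evaluates 13 mod 5 and 13 mod 6, and counts how many of the 30 hash values map to a different node after the number of nodes grows from 5 to 6.

```python
def node_for(hash_value: int, node_count: int) -> int:
    """Return the node index chosen by taking the hash value modulo the node count."""
    return hash_value % node_count


def count_moved(old_count: int, new_count: int, hash_values) -> int:
    """Count hash values whose node index changes when the node count changes."""
    return sum(
        1
        for h in hash_values
        if node_for(h, old_count) != node_for(h, new_count)
    )


# The single piece of data from the example: hash value 13.
before = node_for(13, 5)  # placement with 5 nodes
after = node_for(13, 6)   # placement with 6 nodes
print(before, after)  # 3 1

# Assumed sample: hash values 0..29 stand in for 30 stored pieces of data.
hash_values = range(30)
moved = count_moved(5, 6, hash_values)
print(moved, "of", len(hash_values))  # 25 of 30
```

In this sample only the hash values 0 to 4 keep the same node index, so five sixths of the data would have to be migrated after a single node is added.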