In recent years, the amount of data processed by computer systems that execute Web applications has increased rapidly, and various systems that improve data access performance by distributing data across a plurality of servers are known. For example, in an RDBMS (Relational DataBase Management System), a method is known which splits data into predetermined ranges (for example, key ranges) and distributes the split data to a plurality of servers, thereby improving the access performance of the entire system.
A NoSQL (Not Only SQL) database such as a KVS (Key Value Store) is known as a system used in a cache server or the like. The KVS stores a plurality of records, each of which is a pair of a data identifier (key) and data (value).
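As a rough illustration of the record model just described, a KVS can be sketched as a mapping from keys to values. The class and method names below are illustrative assumptions, not the interface of any particular KVS product:

```python
# Minimal sketch of a key-value store: a set of records,
# each a pair of a data identifier (key) and data (value).
# All names here are illustrative, not from a specific product.
class KeyValueStore:
    def __init__(self):
        self._records = {}  # key -> value

    def put(self, key, value):
        """Store a record (key, value)."""
        self._records[key] = value

    def get(self, key):
        """Return the value for `key`, or None if absent."""
        return self._records.get(key)

    def delete(self, key):
        """Remove the record for `key` if it exists."""
        self._records.pop(key, None)


store = KeyValueStore()
store.put("user:42", {"name": "Alice"})
print(store.get("user:42"))  # -> {'name': 'Alice'}
```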
The KVS can take various structures: a structure (memory store) in which data is stored in a volatile recording medium (for example, a memory) that can be accessed at high speed, a structure (disk store) in which data is stored in a nonvolatile recording medium (for example, an SSD (Solid State Disk) or an HDD) with excellent data persistence, and combinations of the two.
An in-memory KVS realizes faster data access than a disk-type KVS but has some drawbacks. First, due to physical restrictions or the like, the memory capacity that can be mounted in one server is smaller than that of an SSD or an HDD, so the amount of data that can be stored in the in-memory KVS is smaller than in the disk-type KVS. Second, since the memory is a volatile recording medium, data in the memory is erased when the server stops due to a failure.
An example of a system which overcomes these drawbacks is an in-memory distributed KVS (hereinafter referred to as a distributed KVS). The distributed KVS is a KVS formed on the memories of servers included in a cluster composed of a plurality of servers. For the first drawback, a memory capacity that cannot be obtained by one server can be secured by integrating the memories of a plurality of servers. For the second drawback, erasure of data can be avoided even when some servers stop, by copying the same data between a plurality of servers.
Each server forming the distributed KVS manages a range that does not overlap the ranges of the other servers and stores the aggregate of data included in that range (hereinafter referred to as a partition). Furthermore, each server stores copies of partitions managed by other servers.
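The range-based placement described above can be sketched as follows. The server names, range boundaries, and the "next server holds the copy" replication rule are illustrative assumptions for this sketch, not taken from any particular distributed KVS:

```python
import bisect

# Illustrative sketch of non-overlapping key ranges (range partitioning).
# Each server manages one range; as an assumed replication rule, the next
# server in the list stores the copy of that partition.
SERVERS = ["server-A", "server-B", "server-C"]
# Upper bounds (exclusive) of each server's key range, in sorted order:
# server-A manages [0, 100), server-B [100, 200), server-C [200, 300).
RANGE_UPPER_BOUNDS = [100, 200, 300]

def master_for(key: int) -> str:
    """Return the server managing the range that contains `key`."""
    idx = bisect.bisect_right(RANGE_UPPER_BOUNDS, key)
    if idx >= len(SERVERS):
        raise KeyError(f"key {key} is outside all ranges")
    return SERVERS[idx]

def replica_for(key: int) -> str:
    """Return the server storing the copy of the partition (assumed rule)."""
    idx = bisect.bisect_right(RANGE_UPPER_BOUNDS, key)
    return SERVERS[(idx + 1) % len(SERVERS)]


print(master_for(150), replica_for(150))  # -> server-B server-C
```

Because the ranges do not overlap, each key has exactly one master server, and the copy on another server is what allows processing to continue when the master stops.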
Since a special server such as a management server is either absent or multiplexed in the distributed KVS, there is no single point of failure. That is, even when a failure occurs in an arbitrary server, the other servers can continue processing on the basis of the copied partitions, so the computer system does not stop. Therefore, the fault tolerance of the distributed KVS is secured.
When the amount of data stored in the distributed KVS increases, the load on each server increases or the free space for storing data is exhausted. One of the measures for solving this problem is scale-out, in which a new server is added to the cluster. In the scale-out, a process of changing the ranges managed by the servers (hereinafter referred to as a rebalancing process) is performed. Moreover, when data is distributed to only some servers in an unbalanced manner, the throughput of the system decreases. As a measure for solving this problem as well, the rebalancing process is performed to resolve the data imbalance.
Techniques related to the above-mentioned rebalancing process are disclosed in PTL 1 and PTL 2. PTL 1 describes that a rebalancing process can be realized by preparing a new range and a new partition corresponding to the new range, alongside an existing range and the existing partition corresponding to that range, copying data from the existing partition to the new partition, and, after the copying is completed, switching access from the existing range and partition to the new range and partition.
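The prepare-copy-switch procedure described for PTL 1 can be sketched roughly as below. This is a simplified, single-process illustration under assumed data structures (ranges as half-open integer intervals, partitions as dictionaries, a routing table mapping ranges to server names); a real system must also handle writes that arrive during the copy:

```python
# Simplified sketch of the rebalancing process described above:
# 1) prepare a new partition for the new range,
# 2) copy data from the existing partition into it,
# 3) switch access to the new range/partition once copying completes.
def rebalance(routing, old_range, new_range, partitions):
    old_partition = partitions[old_range]
    # Step 1: prepare an empty new partition for the new range.
    new_partition = {}
    # Step 2: copy the records that fall inside the new range.
    lo, hi = new_range
    for key, value in old_partition.items():
        if lo <= key < hi:
            new_partition[key] = value
    # Step 3: switch access to the new range and partition.
    partitions[new_range] = new_partition
    routing[new_range] = "new-server"  # illustrative server name
    # Remove the moved records from the existing partition.
    for key in list(new_partition):
        del old_partition[key]


partitions = {(0, 200): {10: "a", 150: "b"}}
routing = {(0, 200): "server-A"}
rebalance(routing, (0, 200), (100, 200), partitions)
print(partitions[(100, 200)])  # -> {150: 'b'}
```

The switch in step 3 is the critical point: clients must see either the old placement or the new one, so in practice it is performed only after the copy has fully completed, as the passage notes.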