This invention relates to a distributed database constructed from a plurality of computers.
In recent years, the amount of data handled by computer systems that execute Web applications has increased explosively, and various systems are known for enhancing data access performance by distributing the data across a plurality of servers. For example, in a relational database management system (RDBMS), there is known a method of dividing data into predetermined ranges and allocating the divided data to a plurality of servers, thereby enhancing the access performance of the entire system.
Moreover, as a system used in a cache server or the like, there is known a Not only SQL (NoSQL) database, such as a key-value store (KVS).
Various configurations are adopted for the KVS. For example, there are known a configuration (memory store) in which data is stored in a volatile storage medium that can be accessed at high speed, such as a memory; a configuration (disk store) in which data is stored in a non-volatile storage medium that offers superior durability of data storage, such as a solid state disk (SSD) or an HDD; and a configuration in which the above-mentioned configurations are used in combination.
The memory store and the disk store each hold a plurality of records, in each of which data (a value) is paired with an identifier (a key) of that data.
In an in-memory distributed KVS, a cluster is constructed from a plurality of servers. The KVS is constructed on the memories in the servers included in the cluster. Such a configuration enables data to be accessed more quickly and improves the availability of the system.
Each server constructing the distributed KVS stores data of a predetermined management range (e.g., a key range). Further, to ensure the reliability of the data in the distributed KVS, each server stores replicated data of the data included in the management range managed by another server.
Each server executes processing as a master server of the data included in the management range. In other words, in response to a read request including a predetermined key, the server managing the management range including the data corresponding to that key reads the data corresponding to the key. Further, each server operates as a slave server of replicated data of the management range managed by another server.
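The routing described above can be illustrated with a minimal sketch. The server names, the integer key space, the range boundaries, and the rule of placing replicated data on the next servers in order are all assumptions made for this example; the text does not specify how management ranges or slave servers are assigned.

```python
from bisect import bisect_right
from typing import List

# Hypothetical cluster layout (not taken from the text): each server is the
# master of one contiguous key range. RANGE_STARTS[i] is the first key of
# the management range mastered by SERVERS[i].
SERVERS = ["server-A", "server-B", "server-C"]
RANGE_STARTS = [0, 100, 200]


def _range_index(key: int) -> int:
    """Return the index of the management range that contains the key."""
    idx = bisect_right(RANGE_STARTS, key) - 1
    if idx < 0:
        raise KeyError(f"key {key} is below every management range")
    return idx


def master_for(key: int) -> str:
    """Return the master server that handles read requests for the key."""
    return SERVERS[_range_index(key)]


def slaves_for(key: int, multiplicity: int = 1) -> List[str]:
    """Return the slave servers holding replicated data for the key.

    Assumption: replicas are placed on the servers that follow the master
    in SERVERS, wrapping around at the end of the list.
    """
    if not 0 <= multiplicity < len(SERVERS):
        raise ValueError("multiplicity must be smaller than the number of servers")
    idx = _range_index(key)
    return [SERVERS[(idx + i) % len(SERVERS)] for i in range(1, multiplicity + 1)]
```

With this layout, a read request for key 150 is processed by `server-B` as the master server, while `server-C` operates as the slave server for the same management range.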
In the following description, data to be managed by the master server is also referred to as “master data”, and data to be managed by the slave server is also referred to as “slave data”.
Therefore, in the distributed KVS, even when a failure occurs in one server, another server holding replicated data of the master data of that server can continue processing as a new master server.
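As an illustration of this takeover, the following sketch promotes slave data to master data when a server fails. The `Server` class, the `fail_over` function, and the policy of promoting the first surviving server that holds the replicated data are hypothetical and are introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    """Hypothetical in-memory server holding master data and slave data."""
    name: str
    alive: bool = True
    # Both maps are keyed by a management-range id; each value is a key-value store.
    master_data: Dict[str, Dict[str, str]] = field(default_factory=dict)
    slave_data: Dict[str, Dict[str, str]] = field(default_factory=dict)


def fail_over(failed: Server, cluster: List[Server]) -> Dict[str, str]:
    """Promote a surviving slave for every range mastered by the failed server.

    Returns a map from management-range id to the name of the new master.
    Assumption: the first live server holding the replicated data is chosen.
    """
    failed.alive = False
    promoted: Dict[str, str] = {}
    for range_id in list(failed.master_data):
        for server in cluster:
            if server.alive and range_id in server.slave_data:
                # The replicated data becomes the new master data.
                server.master_data[range_id] = server.slave_data.pop(range_id)
                promoted[range_id] = server.name
                break
    return promoted
```

After the promotion, the management range has one fewer copy than before the failure, which is why the multiplicity must subsequently be reestablished.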
In the distributed KVS, as described above, there is no special server such as a management server, and hence there is no single point of failure. Specifically, even when a failure occurs in any one of the servers, another server can continue processing, and hence the computer system does not stop. Accordingly, the distributed KVS can also ensure fault tolerance.
It should be noted that the computer system can arbitrarily determine the number of slave servers, that is, the number of servers in which the replicated data is to be stored. The number of slave servers for one management range is hereinafter also referred to as “multiplicity”.
When one of the servers constructing the distributed KVS stops, the multiplicity of the distributed KVS decreases by one. If the number of servers that stop is equal to or more than the multiplicity of the distributed KVS, a business task using the distributed KVS cannot continue. It is therefore necessary to quickly reestablish the multiplicity of the distributed KVS. In the following description, reestablishing the multiplicity of the distributed KVS is referred to as “recovery”.
During the recovery of the distributed KVS, processing such as the following is executed.
First, processing for starting up a new server to serve as a replacement of the server in which the failure occurred is executed.
Second, data replication for writing the data held by the server in which the failure occurred to the new server is executed. Specifically, the server holding the replicated data of the data held by the server in which the failure occurred transmits the replicated data to the new server. At this stage, it is necessary for the replication source server and the replication destination server to hold the same data. Therefore, when the data held by the replication source server has been updated, the updated data needs to be written to the replication destination server.
Third, processing for adding the new server to the cluster is executed.
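The three steps above can be sketched as follows. This is a simplified, single-process illustration only: the function name `recover`, the use of plain dictionaries as servers, and the representation of in-flight updates as an ordered list are all assumptions, and the sketch does not address how such updates are captured while the distributed KVS keeps running.

```python
from typing import Dict, List, Tuple


def recover(
    source_replica: Dict[str, str],
    pending_updates: List[Tuple[str, str]],
    cluster: Dict[str, Dict[str, str]],
    new_name: str,
) -> Dict[str, str]:
    """Reestablish the multiplicity by bringing a new server into the cluster."""
    # Step 1: start up a new server (modeled here as an empty in-memory store).
    new_store: Dict[str, str] = {}

    # Step 2: data replication. Copy a snapshot of the replication source to
    # the replication destination.
    snapshot = dict(source_replica)
    new_store.update(snapshot)

    # Updates that reach the source during the copy must also be written to
    # the destination so that both servers end up holding the same data.
    for key, value in pending_updates:
        source_replica[key] = value
        new_store[key] = value

    # Step 3: add the new server to the cluster.
    cluster[new_name] = new_store
    return new_store
```

For example, if the source holds `{"k1": "v1", "k2": "v2"}` and the updates `("k2", "v2b")` and `("k3", "v3")` arrive during replication, both the source and the new server hold `{"k1": "v1", "k2": "v2b", "k3": "v3"}` once the new server joins the cluster.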
Examples of applications that utilize the distributed KVS include online commercial transaction systems, such as banking and Internet shopping. In order for such an application to continue processing, recovery needs to be carried out without stopping the distributed KVS.