This invention relates to a fault-tolerant computer system, and more particularly to a technique of adding a standby computer to the computer system.
In database systems, reliability can be ensured by a cluster configuration having a plurality of servers which include an active server for executing data processing and a standby server for taking over the data processing when a failure occurs in the active server. When a disk-based database (DB) is used, data is taken over by using a shared disk which can be referred to by the active server and the standby server, and the processing is continued by the standby server.
On the other hand, when an in-memory database is used, data is held in a memory included in each of the servers. Therefore, the servers cannot share the data through a shared storage device. For this reason, the standby server holds, in its memory, a replica of the database data of the active server, thereby making the data redundant.
When a failure occurs in a server and the server is separated from the cluster, data redundancy is reduced. If the system is kept running for a long time with the reduced data redundancy, a further failure may cause a system halt or data loss. Therefore, an online system that demands continuous operation needs to be prevented from operating while data redundancy is insufficient. To continuously operate the system in a stable manner, the data redundancy needs to be recovered by restoring the failed server to the cluster or by adding a new server to the cluster.
JP 2005-251055 A discloses a technique of promptly recovering redundancy, in which devices that form a pair monitor each other, and, when one of the devices detects a failure that has occurred in the other, a new pair is automatically determined.
In a cluster configuration in which database data is held in data areas independently by the active server and the standby server, when a new standby server is added, the data held by the new standby server needs to be synchronized with the data held by the active server. Here, data synchronization refers to a state in which the data held by the active server can be generated by the standby server. Specifically, the standby server may hold data identical to the data held by the active server, or different data from which the data held by the active server can be generated. In the latter case, for example, when the standby server holds an update log for the database data held by the active server, the database data held by the active server can be generated by the standby server.
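As an illustration of the latter case, the following minimal Python sketch shows how a standby server holding only an update log could regenerate the database data of the active server by replaying the log in order. The names `UpdateRecord`, `apply_update`, and `replay`, as well as the key-value data model, are assumptions made for this example and are not taken from the cited documents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class UpdateRecord:
    """One entry of a hypothetical update log (assumed key-value model)."""
    key: str
    value: Optional[str]  # None marks a deletion of the key


def apply_update(db: Dict[str, str], rec: UpdateRecord) -> None:
    """Apply a single update record to a database held in memory."""
    if rec.value is None:
        db.pop(rec.key, None)
    else:
        db[rec.key] = rec.value


def replay(log: List[UpdateRecord]) -> Dict[str, str]:
    """Regenerate database data by applying the log records in order."""
    db: Dict[str, str] = {}
    for rec in log:
        apply_update(db, rec)
    return db
```

In this sketch, the standby server is regarded as synchronized as long as its log contains every update applied by the active server, even though it does not hold the database data itself.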
However, to make a replica of the database data held by the active server in the new standby server while ensuring data consistency, update processing in the database of the active server needs to be temporarily stopped (a frozen image of the database of the active server needs to be created). In that case, it is also necessary to temporarily stop the processing that requests the active server to perform the database update processing.
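The need to stop update processing can be illustrated with a minimal Python sketch, which assumes a single-process key-value database guarded by one lock. The class `ActiveDatabase` and its methods are hypothetical names introduced for this example. While `frozen_image` copies the data, any caller of `update` is blocked, which corresponds to temporarily stopping update requests to the active server.

```python
import threading
from typing import Dict


class ActiveDatabase:
    """Hypothetical in-memory database of an active server."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()

    def update(self, key: str, value: str) -> None:
        """Apply an update; waits while a frozen image is being created."""
        with self._lock:
            self._data[key] = value

    def frozen_image(self) -> Dict[str, str]:
        """Return a consistent copy of the data taken with updates stopped."""
        with self._lock:
            return dict(self._data)
```

The longer the copy takes, the longer update requests wait, which is the disturbance to the operation of the active server that this approach entails.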
JP 2005-251055 A does not describe making a replica of data in a newly-added device. US 2006/0253731 A discloses a technique in which, in order to make a backup of data without disturbing the operation of an active server, first and second standby servers are prepared, and, when a database of the active server is synchronized with a database of the first standby server, updating of the first standby server is stopped and the data of the first standby server is replicated to the second standby server.