A failover cluster system is known that includes an active server, a standby server, and a shared disk in order to prevent processing from stopping when a computer failure occurs, as shown in Patent Document 1, for example. In such a failover cluster system, data of a business application running on each of the servers is stored on the shared disk, which can be accessed from both the active server and the standby server. When a failure occurs in the active server, cluster software executes switchover (referred to as “failover” hereinafter) of the business application and starts the business application on the standby server. The business application started on the standby server uses the data stored on the shared disk to resume business processing from the point at which the business application stopped on the active server.
Since a failover cluster system equipped with a shared disk stores data on the shared disk as mentioned above, there is a risk that the data will be corrupted if both the active server and the standby server write to the shared disk simultaneously. Therefore, such a system executes exclusive control so that, in normal operation, only the active server writes to the shared disk.
(1: Exclusive Control at the Time of Failover)
To execute failover from an active server to a standby server, the business application must be started on the standby server only after the active server has been reliably stopped from accessing the shared disk.
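The ordering requirement above can be illustrated with a minimal sketch. The names `fail_over`, `stop_disk_access`, `start_application`, and `FencingError` are hypothetical and are introduced only for illustration; they are not taken from any particular cluster software.

```python
class FencingError(Exception):
    """Raised when stopping of disk access by the active server is unconfirmed."""


def fail_over(stop_disk_access, start_application):
    """Start the application on the standby only after fencing is confirmed.

    stop_disk_access: hypothetical callable that returns True once the active
        server can no longer access the shared disk.
    start_application: hypothetical callable that starts the business
        application on the standby server.
    """
    if not stop_disk_access():
        # Without confirmation, starting the standby risks simultaneous writes.
        raise FencingError("active server may still access the shared disk")
    start_application()
    return "standby-active"
```

In this sketch the application is never started when fencing is unconfirmed, which reflects the requirement that the two steps not be reordered.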
(2: Exclusive Control in the Split-Brain State)
There is a mechanism (a split-brain resolution method) in which, when a cluster enters a split-brain state, that is, a state in which the cluster cannot recognize the statuses of the servers due to a failure of the network between the servers, the server to which the cluster should switch a business application is decided based on the presence of access to the shared disk. With this split-brain resolution method, when a server with a faulty network device accesses the shared disk, the cluster may falsely recognize that the faulty server is operating normally. Thus, the faulty server must be reliably prevented from accessing the shared disk.
As stated above, when executing failover in a cluster system, exclusive control must be properly executed to reliably stop a faulty server from accessing the shared disk.
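The false recognition described above can be sketched as follows. This is only an illustration under assumed conditions: the function `servers_seen_alive`, the server names, the timestamps, and the observation window are all hypothetical, and an actual cluster may judge disk access differently.

```python
def servers_seen_alive(last_disk_access, now, window):
    """Return the servers judged alive from shared-disk access alone.

    last_disk_access: mapping of server name to the time (in seconds) of its
        most recent access to the shared disk (hypothetical bookkeeping).
    now: current time in seconds.
    window: how recent an access must be for a server to count as alive.
    """
    return sorted(
        name for name, accessed_at in last_disk_access.items()
        if now - accessed_at <= window
    )


# server_a has a faulty network device but still reaches the shared disk,
# so judging by disk access alone reports it as alive.
print(servers_seen_alive({"server_a": 99.0, "server_b": 40.0}, now=100.0, window=5.0))
# prints ['server_a']
```

Because the judgment relies only on disk access, a faulty server that is not fenced remains indistinguishable from a healthy one.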
[Patent Document 1] Japanese Unexamined Patent Application Publication No. JP-A 2012-173752
As a method for stopping access to a shared disk in a cluster system, unmounting the disk, deactivating a port of an FC (Fibre Channel) switch, causing an OS (Operating System) panic, or the like is used. However, each of the abovementioned methods has a problem as described below.
Unmounting a disk has a problem that it takes time to execute and that, when a process is in the middle of writing, the unmount fails. Deactivating a port of an FC switch has a problem that it takes time because the server must connect to a module outside the server, and that, depending on the type of failure, the server may be unable to connect to the FC switch. Causing an OS panic has a problem that input/output remaining in the cache of an HBA (Host Bus Adapter) card may still be written to the shared disk.
Further, in the case of executing I/O fencing that prevents writing to a shared disk in an HW (Hardware) layer, the supply of power to the card is shut off, for example. Executing I/O fencing in the HW layer realizes secure I/O fencing, but there is no means for notifying a standby server of the completion of the I/O fencing. Therefore, for example, the standby server needs to wait for a heartbeat communication timeout before starting failover, and there is a problem that failover cannot be executed rapidly.
Thus, there has been a problem that it is impossible to execute failover speedily while protecting shared data in a failover cluster system.
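The delay caused by waiting for the heartbeat timeout can be illustrated with simple arithmetic. The function `failover_start_time`, its parameters, and the numeric values below are hypothetical; the `notified=True` case is shown only for contrast and assumes a notification means that, as stated above, is not available in this arrangement.

```python
def failover_start_time(fence_done_at, last_heartbeat_at, heartbeat_timeout, notified):
    """Return the earliest time (in seconds) at which the standby can start failover.

    fence_done_at: time at which I/O fencing completed on the active server.
    last_heartbeat_at: time of the last heartbeat received by the standby.
    heartbeat_timeout: silence period after which the standby assumes failure.
    notified: whether the standby is told of fencing completion (hypothetical).
    """
    if notified:
        return fence_done_at
    # Without notification, the standby learns of the failure only when the
    # heartbeat times out, even if fencing finished much earlier.
    return max(fence_done_at, last_heartbeat_at + heartbeat_timeout)
```

For example, with fencing completed at 1.0 second, the last heartbeat at 0.5 seconds, and a 30-second timeout, the standby in this sketch cannot start failover until 30.5 seconds.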