Mission-critical systems are information processing systems that require high reliability, fault tolerance, and availability, and typically continue to operate 24 hours a day, 365 days a year. A mission-critical system has, for example, a cluster system architecture in which failover is executed when a fault occurs in a server or the like. The term “failover” refers to a function by which, for example, when a fault occurs in a working server, a standby server takes over processes and data from the working server.
In cluster systems, in order to achieve data integrity and task-service continuity, it is important that only one server perform processing as the working server in any situation, and there is demand for a scheme for ensuring that two or more servers do not operate as working servers at the same time. A state in which two or more servers operate as working servers may hereinafter be referred to as a “double active operation”.
Heretofore, a cluster system using power-supply control devices has been available as a technology for inhibiting the double active operation. The power-supply control devices are apparatuses having a dedicated function for starting up and shutting down servers. Herein, a switching-source server is a server that has been operating as a working server before execution of failover, and a switching-target server is a server that operates as a working server after execution of failover. In the cluster system using the power-supply control devices, during switching of the working server, the switching-target server uses the power-supply control device to stop the power supply of the switching-source server. Upon detecting that the power supply of the switching-source server has stopped, the switching-target server switches to operating as the working server, thereby executing failover while inhibiting the double active operation.
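By way of illustration only, the switching sequence described above may be sketched as follows. This is a minimal in-memory simulation and not an implementation of any particular product: the names `Server`, `PowerControlDevice`, `failover`, and `working_servers` are hypothetical and do not appear in the related art, and a real power-supply control device would be reached over a dedicated management interface rather than by a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A cluster member (hypothetical model for illustration)."""
    name: str
    powered_on: bool = True
    is_working: bool = False


@dataclass
class PowerControlDevice:
    """Hypothetical stand-in for a power-supply control device."""
    servers: dict = field(default_factory=dict)

    def register(self, server):
        self.servers[server.name] = server

    def power_off(self, name):
        server = self.servers[name]
        server.powered_on = False
        # A server without power can no longer act as a working server.
        server.is_working = False

    def is_powered_off(self, name):
        return not self.servers[name].powered_on


def failover(source, target, device):
    """Promote the switching-target server only after the power supply
    of the switching-source server is confirmed to have stopped."""
    device.power_off(source.name)
    if not device.is_powered_off(source.name):
        # Power-off not confirmed: do not promote, so that a double
        # active operation cannot arise.
        return False
    target.is_working = True
    return True


def working_servers(servers):
    """Return the names of all servers currently operating as working servers."""
    return [s.name for s in servers if s.is_working]
```

In this sketch, promotion of the switching-target server is gated on confirmation of the power-off rather than on the power-off request alone, which is the property that inhibits the double active operation. If the confirmation cannot be obtained, `failover` returns `False` and the switching-target server remains a standby server.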
One related technology is a technology in which a failed node notifies a service processor about the occurrence of a failure, or transmits failure information to another node in the same partition, to thereby perform processing for the failure. There is also a technology in which, when a server that is operating as a standby system detects a fault in a server that is operating as a working server, a request for blocking communication to/from communication equipment connected to the faulty server is issued to thereby disconnect the faulty server from a network. Examples of related technologies are disclosed in Japanese Laid-open Patent Publication No. 2004-62535 and Japanese Laid-open Patent Publication No. 2007-233586.