The present invention relates to a node switching method and an information processing system. More particularly, it relates to an I/O node-switching method and an information processing system for the case where, when a failure occurs, processing that was being executed by the failed host is taken over by another host so that the other host can continue that processing.
Systems that form and support the social infrastructure, such as financial systems and securities systems, are generally required to be highly reliable. That is, these systems are not permitted to suffer a service interruption, i.e., a system-down condition. To meet this requirement, the devices that make up these systems, such as the hosts and the paths connecting the hosts to a disk device, are given a dual-redundant structure. For example, even if a failure occurs in an execution-node host, the processing is immediately switched to a standby-node host, which prevents the entire system from being down for a long time. This switching operation is referred to as “node switching”.
As described above, in a dual-redundant system, if a failure occurs in an execution-node host and node switching to a standby-node host is performed, the standby-node host becomes the new execution-node host. During and after the node switching, however, I/O access control must be performed so that I/O to the disk device from the host that was previously the execution node is cut off, and so that the cut-off of I/O from the host that was previously the standby node is released. This control is needed to prevent data corruption caused by the shared disk being accessed simultaneously by both nodes, i.e., the host that has newly become the execution node and the host that was previously the execution node.
Methods for performing the I/O access control described above include one performed on the host side and one performed on the disk-device side.
As a method of performing the node switching on the host side in response to a failure of the OS itself or of the node switching mechanism, the technology disclosed in JP-A-6-325008 is known. In this conventional technology, a standby-node host that has detected the failure resets the execution-node host so as to interrupt the I/O of the execution-node host, and then performs the node switching.
The method of performing the I/O access control on the disk-device side is as follows: The I/O access control is performed with respect to a plurality of paths by using the PERSISTENT RESERVE command defined in the ANSI standard SPC (SCSI-3 Primary Commands). I/O from a given path can also be cut off, which cancels all I/O from that path that is in progress. This method, which performs logical-disk exclusion control on a per-host basis by using PERSISTENT RESERVE, is disclosed in JP-A-2000-322369. A path that uses PERSISTENT RESERVE first registers a Reservation Key with the logical disks. Two methods are then available for controlling access to the logical disks.
In the first access-control method, only access from the path that has applied a Reservation is permitted, regardless of whether a Reservation Key has been registered. In the second access-control method, once a Reservation has been applied from some path, access is permitted from all paths that have registered Reservation Keys. Cutting off access from a specific host and path requires specifying the Reservation Key to be cut off.
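The two access rules can be expressed as a single decision function. The following is a minimal sketch for illustration only, not an implementation of the SCSI command set: the names `AccessMethod` and `is_access_permitted` are hypothetical, and the behavior when no Reservation has been applied (access unrestricted) is an assumption that the description above does not state.

```python
from enum import Enum
from typing import Optional, Set


class AccessMethod(Enum):
    """The two access-control methods (hypothetical names)."""
    RESERVER_ONLY = "first"      # only the reserving path may access
    ALL_REGISTRANTS = "second"   # every registered path may access


def is_access_permitted(
    method: AccessMethod,
    path: str,
    reserving_path: Optional[str],
    registered_paths: Set[str],
) -> bool:
    """Decide whether `path` may access the logical disk."""
    if reserving_path is None:
        # Assumption of this sketch: with no Reservation applied,
        # the disk does not restrict access.
        return True
    if method is AccessMethod.RESERVER_ONLY:
        # First method: registration is irrelevant, only the reserver counts.
        return path == reserving_path
    # Second method: any path with a registered Reservation Key is admitted.
    return path in registered_paths


if __name__ == "__main__":
    registered = {"exec-path", "standby-path"}
    for method in AccessMethod:
        allowed = is_access_permitted(method, "standby-path", "exec-path", registered)
        print(method.name, "standby-path allowed:", allowed)
```

Under the first method a registered standby path is still refused while the execution node holds the Reservation; under the second method it would be admitted.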
Consequently, access control to the logical disks by the first method is performed in the following steps: The execution-node host and the standby-node host register their Reservation Keys in advance, and the execution-node host applies the Reservation. When a failure occurs, the standby-node host specifies the Reservation Key of the execution-node host and thereby performs the cut-off operation. After that, the standby-node host applies the Reservation.
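The sequence for the first method can be traced with a toy in-memory model. This is a sketch under stated assumptions: `LogicalDisk`, its methods, the path names, and the key strings are all hypothetical, each host is reduced to a single path, and the model treats a disk with no Reservation as unrestricted, which need not match how a real device behaves between the cut-off and the new Reservation.

```python
from typing import Dict, Optional


class LogicalDisk:
    """Toy logical disk under the first method (hypothetical model)."""

    def __init__(self) -> None:
        self.keys: Dict[str, str] = {}       # path -> registered Reservation Key
        self.reserver: Optional[str] = None  # path currently holding the Reservation

    def register(self, path: str, key: str) -> None:
        self.keys[path] = key

    def reserve(self, path: str) -> None:
        if path not in self.keys:
            raise PermissionError(f"{path} has not registered a Reservation Key")
        if self.reserver not in (None, path):
            raise PermissionError(f"Reservation is held by {self.reserver}")
        self.reserver = path

    def cut_off(self, requester: str, target_key: str) -> None:
        """Remove every registration that uses target_key."""
        if requester not in self.keys:
            raise PermissionError(f"{requester} has not registered a Reservation Key")
        for path in [p for p, k in self.keys.items() if k == target_key]:
            del self.keys[path]
            if self.reserver == path:
                self.reserver = None  # the cut-off path loses its Reservation

    def may_access(self, path: str) -> bool:
        # First method: only the reserving path is admitted.
        # Assumption: no Reservation means no restriction.
        return self.reserver is None or path == self.reserver


def fail_over_first_method() -> LogicalDisk:
    disk = LogicalDisk()
    # In advance: both hosts register, and the execution node reserves.
    disk.register("exec-path", "KEY-EXEC")
    disk.register("standby-path", "KEY-STANDBY")
    disk.reserve("exec-path")
    # Failure: the standby node cuts off the execution node's key,
    # then applies its own Reservation.
    disk.cut_off("standby-path", "KEY-EXEC")
    disk.reserve("standby-path")
    return disk


if __name__ == "__main__":
    disk = fail_over_first_method()
    print("standby may access:", disk.may_access("standby-path"))
    print("former execution node may access:", disk.may_access("exec-path"))
```

Because the standby node registered its key before the failure, the only work left at failover time is one cut-off and one Reservation.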
Access control to the logical disks by the second method is performed in the following steps: Only the execution-node host registers Reservation Keys, doing so for all the paths from the execution-node host to the disk device. The standby-node host registers no Reservation Key. When a failure occurs, the standby-node host registers Reservation Keys for all the paths from the standby-node host to the disk device. Next, the standby-node host specifies all the Reservation Keys of the execution-node host and thereby performs the cut-off operations. After that, the standby-node host applies the Reservation.
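The second method differs in that the standby node registers only at failover time and must cut off one key per execution-node path. The sketch below illustrates that ordering with several paths per host. It is a toy model, not the SCSI protocol: `SharedDisk`, `fail_over_second_method`, and the `key-<path>` naming are hypothetical, the use of one distinct key per path is an assumption, and so is the step in which the execution node applies the Reservation from its first path before the failure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SharedDisk:
    """Toy logical disk under the second method (hypothetical model)."""
    keys: Dict[str, str] = field(default_factory=dict)  # path -> Reservation Key
    reserver: Optional[str] = None                      # path holding the Reservation

    def register(self, path: str, key: str) -> None:
        self.keys[path] = key

    def reserve(self, path: str) -> None:
        if path not in self.keys:
            raise PermissionError(f"{path} has not registered a Reservation Key")
        if self.reserver not in (None, path):
            raise PermissionError(f"Reservation is held by {self.reserver}")
        self.reserver = path

    def cut_off(self, requester: str, target_key: str) -> None:
        if requester not in self.keys:
            raise PermissionError(f"{requester} has not registered a Reservation Key")
        for path in [p for p, k in self.keys.items() if k == target_key]:
            del self.keys[path]
            if self.reserver == path:
                self.reserver = None

    def may_access(self, path: str) -> bool:
        # Second method: once a Reservation exists, all registered paths pass.
        # Assumption: no Reservation means no restriction.
        return self.reserver is None or path in self.keys


def fail_over_second_method(exec_paths: List[str], standby_paths: List[str]) -> SharedDisk:
    disk = SharedDisk()
    # Normal operation: only the execution node registers, on all its paths.
    exec_keys = {p: f"key-{p}" for p in exec_paths}
    for path, key in exec_keys.items():
        disk.register(path, key)
    disk.reserve(exec_paths[0])  # assumed: Reservation applied from one path
    # Failure: the standby node registers all of its paths first ...
    for path in standby_paths:
        disk.register(path, f"key-{path}")
    # ... then cuts off every key of the execution node ...
    for key in exec_keys.values():
        disk.cut_off(standby_paths[0], key)
    # ... and finally applies the Reservation.
    disk.reserve(standby_paths[0])
    return disk


if __name__ == "__main__":
    disk = fail_over_second_method(["e1", "e2"], ["s1", "s2"])
    for path in ["e1", "e2", "s1", "s2"]:
        print(path, "may access:", disk.may_access(path))
```

In this model the failover cost grows with the number of paths on both hosts, since every standby path must be registered and every execution-node key must be cut off before the Reservation is applied.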