1. Field of the Invention
This invention relates to an information processing technology using a computer, and in particular to information processing in cases in which an anomaly occurs due to, for example, a specific fault.
2. Description of the Related Art
Computer systems are known which have, for example, a host device (for example, a host computer), and first and second storage device systems (for example, disk array systems such as RAID (Redundant Array of Independent Disks)). Each of the first and second storage device systems comprises at least one logical volume. One logical volume is prepared for one or a plurality of physical storage devices (for example, hard disks) comprised by the storage device system.
In such a computer system, for example, remote copy processing may be performed. In remote copying, the data in a logical volume of a first storage device system is copied to a logical volume in a second storage device system, without passing through a host device. The logical volume which is the copying source of the remote copying is called the copy source volume, and the logical volume which is the copying target of the remote copying is called the copy target volume. The copy source volume and copy target volume may, for example, have the same storage capacity and form a one-to-one relationship (in other words, form a copy pair). The data in the copy source volume is copied to the copy target volume via a remote copy line (for example, a dedicated circuit, public circuit, or similar) connecting the first and second storage device systems. In remote copying, the copying direction is, for example, unidirectional, and in the event of write requests from a host device, the copy source volume can accept a request, but the copy target volume cannot accept a request. When data contained in the copy source volume is updated (for example, when a second data item is overwritten by a first data item), the update data (for example, the difference between the first and second data items) is written to the copy target volume from the copy source volume via a remote copy line, and by this means the data in the copy source volume and the data in the copy target volume are made the same. Technology related to such remote copying is disclosed in, for example, Japanese Patent Laid-open No. 2003-76592 and U.S. Pat. No. 5,742,792.
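The unidirectional behavior of a copy pair described above can be illustrated with a minimal sketch. The class names `Volume` and `CopyPair` and the method `host_write` are hypothetical and do not appear in the cited references; the remote copy line is simulated by a direct method call, and whole blocks are forwarded rather than differences.

```python
class Volume:
    """A toy logical volume: a mapping from block number to data."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


class CopyPair:
    """Hypothetical one-to-one pairing of a copy source and a copy target."""

    def __init__(self, source, target):
        self.source = source
        self.target = target

    def host_write(self, volume, block, data):
        # Copying is unidirectional: only the copy source accepts host writes.
        if volume is not self.source:
            raise PermissionError(
                f"{volume.name} is a copy target; host writes are rejected"
            )
        self.source.blocks[block] = data
        # Forward the update without passing through the host device.
        self._remote_copy(block, data)

    def _remote_copy(self, block, data):
        # Stands in for the remote copy line between the two storage systems.
        self.target.blocks[block] = data


src = Volume("copy-source")
dst = Volume("copy-target")
pair = CopyPair(src, dst)
pair.host_write(src, 0, b"first data item")
```

After the write, both volumes hold the same data, while a host write addressed to the copy target raises an error instead of being accepted.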
A computer system may comprise a plurality of host devices, such as for example first and second host devices. In such a computer system, the same logical volume may be shared by the first and second host devices (below, such logical volumes are called “shared volumes”). A shared volume is exclusively controlled. Specifically, control is executed such that access requests for a shared volume are permitted only from the first host device, and access requests from the second host device in the same time period are not permitted. More specifically, in a computer system which for example adopts SCSI (Small Computer System Interface) as the interface between host devices and storage device systems, when a first host device sends to a shared volume a reserve-system command defined by the SCSI protocol, and when the shared volume is not being used by any host device, the storage device system, upon receiving the above reserve-system command from the first host device, puts the shared volume into the reserved state with respect to the first host device, and by this means can ensure that access requests from a second host device are not accepted. If, while the shared volume is in the reserved state with respect to the first host device, a request to access the shared volume is received from the second host device, the storage device system returns to the second host device status data (for example, data indicating the reservation conflict status) indicating that the shared volume has been reserved by another host device.
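The exclusive control described above can be sketched as follows. This is a simplified model under stated assumptions, not an implementation of the SCSI protocol: the class `SharedVolume` and its methods are hypothetical, and only the status names GOOD and RESERVATION CONFLICT are taken from SCSI.

```python
GOOD = "GOOD"
RESERVATION_CONFLICT = "RESERVATION CONFLICT"


class SharedVolume:
    """Toy shared volume that tracks which host, if any, holds the reservation."""

    def __init__(self):
        self.reserved_by = None

    def reserve(self, host):
        # A reserve-system command succeeds if the volume is free
        # or is already reserved by the same host.
        if self.reserved_by in (None, host):
            self.reserved_by = host
            return GOOD
        return RESERVATION_CONFLICT

    def release(self, host):
        # Only the reserving host actually clears the reservation.
        if self.reserved_by == host:
            self.reserved_by = None
        return GOOD

    def access(self, host):
        # Access requests from any other host are refused while reserved.
        if self.reserved_by not in (None, host):
            return RESERVATION_CONFLICT
        return GOOD
```

In this sketch, once a first host has reserved the volume, an access request from a second host returns the reservation conflict status until the first host releases the reservation.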
A host device comprises, for example, application software (henceforth called an “application”) and driver software for the storage device system (henceforth called “disk control software”). An application can issue I/O requests for writing of data to a logical volume or for reading of data from a logical volume, according to user operations or other conditions. Disk control software can receive an I/O request issued by an application, convert the I/O request into a format which can be processed by the storage device system (for example, a format based on the SCSI protocol), and send the converted I/O request to the storage device system. Also, disk control software may, for example, receive data indicating an anomaly status (henceforth called “anomaly status data”) from the storage device system as a response to an I/O request. When the received anomaly status data indicates a specific anomaly status, the disk control software can execute retry processing, for example, processing to send to the storage device system once more a converted I/O request which was sent previously.
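The resending behavior of the disk control software can be sketched as a bounded loop. The function `send_with_retry`, its parameters, and the status strings in the usage note are hypothetical; the source does not specify which anomaly statuses trigger a resend or how many attempts are made.

```python
def send_with_retry(send, request, retryable, max_attempts=3):
    """Send a converted I/O request, resending it while the returned
    status is one of the statuses listed in `retryable`.

    `send` is any callable that takes the request and returns a status.
    The last status received is returned to the caller.
    """
    status = None
    for _ in range(max_attempts):
        status = send(request)
        if status not in retryable:
            # Either success or an anomaly that is not worth resending.
            return status
    # Attempts exhausted: report the last anomaly status upward.
    return status
```

For example, with `retryable={"BUSY"}` and a storage system that answers "BUSY" twice before answering "GOOD", the caller sees only the final "GOOD" status.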
A plurality of host devices can also be connected to configure a cluster. In this case, each of the host devices (hereafter, for convenience, called “cluster servers”) comprised by the cluster is equipped with, for example, software (hereafter “cluster software”) to realize the cluster. Below, for convenience, resources managed by a cluster (for example, physical storage devices and other hardware, as well as database management systems and other software) are called “cluster resources”. A computer system comprising a cluster is called a “cluster system”.
By performing what is called fail-over processing, a cluster system can continue usage of cluster resources. Specifically, when for example use of a certain cluster resource by a certain cluster server cannot be continued due to the occurrence of a fault in the cluster server, the cluster software within the cluster server performs fail-over processing, that is, performs processing to switch use of the above cluster resource to another cluster server which is operating normally, so that use of the cluster resource can be continued. The plurality of cluster servers comprised by the cluster system are connected by a network using the Internet protocol (IP) or similar. The cluster software in each of the cluster servers, by communicating with other cluster servers over this network, monitors the states of the communicating cluster servers. This communication is called “cluster communication” or “heartbeat communication”.
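Heartbeat monitoring and fail-over as described above can be sketched in a few lines. The names `ClusterServer`, `peer_is_down`, and `fail_over`, and the timeout-based detection rule, are assumptions made for illustration; the source does not state how a fault is detected from the heartbeat. Time is passed in explicitly so that the example is deterministic.

```python
class ClusterServer:
    """Toy cluster server that records heartbeats received from its peers."""

    def __init__(self, name):
        self.name = name
        self.resources = set()
        self.last_heartbeat = {}  # peer name -> time of last heartbeat

    def receive_heartbeat(self, peer_name, now):
        self.last_heartbeat[peer_name] = now

    def peer_is_down(self, peer_name, now, timeout):
        # Assumed rule: a peer is considered down if no heartbeat
        # has arrived within `timeout` seconds.
        last = self.last_heartbeat.get(peer_name)
        return last is None or now - last > timeout


def fail_over(failed, survivor):
    """Switch use of the failed server's cluster resources to the survivor."""
    survivor.resources |= failed.resources
    failed.resources.clear()
```

A surviving server would call `fail_over` only after `peer_is_down` returns true for the peer that owned the resources.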
A cluster system in which a plurality of cluster servers share a single storage device system is called, for example, a shared disk model cluster system. In a shared disk model cluster system, when for example the heartbeat communication between two cluster servers is cut off, each of the two cluster servers can confirm the operating state of the other cluster server through shared exclusive control using a shared volume, and by this means it is possible to prevent a state (hereafter called a “split-brain” state) in which the two cluster servers operate separately. Below, for convenience, control performed to prevent such a split-brain state (in the above example, shared exclusive control) is called “arbitration”.
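Arbitration through a shared volume can be sketched as a race to reserve that volume when the heartbeat is lost. This is an illustrative simplification: the names `ArbitrationVolume` and `arbitrate` and the "active"/"stopped" outcomes are hypothetical, and real cluster software may use a more elaborate procedure.

```python
class ArbitrationVolume:
    """Toy shared volume that can be reserved by at most one cluster server."""

    def __init__(self):
        self.owner = None

    def try_reserve(self, server):
        # The reservation succeeds only if the volume is free
        # or is already held by the same server.
        if self.owner is None or self.owner == server:
            self.owner = server
            return True
        return False


def arbitrate(volume, servers):
    """Each server attempts the reservation in turn after losing the heartbeat.

    The server that obtains the reservation continues operating; a server
    that is refused infers that its peer is still running and stops, so
    the two never operate separately (no split-brain state).
    """
    return {s: ("active" if volume.try_reserve(s) else "stopped") for s in servers}
```

Because the volume admits only one reservation holder, exactly one server remains active regardless of the order in which the attempts arrive.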
Cluster software for realization of shared disk model cluster systems comprises software to, for example, perform shared exclusive control (that is, requesting that the disk control software issue reserve-system commands) for storage disks used to perform arbitration (called, for example, arbitration disks, arbitration volumes, or quorum disks) using SCSI commands, by this means avoiding a split-brain state. For example, cluster software can periodically issue I/O requests to a storage device system via disk control software, reference the response results received from the storage device system via the disk control software, and thereby monitor the state of the storage device system receiving the I/O requests. When a response result is an anomaly status, the cluster software judges whether a fault has occurred, and can execute the above-described fail-over processing. Cluster software has been disclosed in, for example, U.S. Pat. No. 6,279,032 and U.S. Pat. No. 6,401,120.