The present invention relates to a data center system comprising a plurality of data centers, and more particularly to failover/failback control that is exercised when host computers in a cluster configuration are connected to each data center.
Computers have come to retain increasingly valuable information as society has become more IT-driven in recent years. If, for instance, a natural calamity happens unexpectedly, it is extremely important that data be safely saved and retained. Under such circumstances, it is essential to provide storage system/data redundancy and to establish proper means for storage system/data recovery.
Meanwhile, a cluster service can be used as a means of providing system redundancy. A cluster is a system in which a standby computer is furnished in addition to a main computer to guard against a computer failure, so that even if the main computer stops running, processing can be transferred to the standby computer and the current operation can continue without a system shutdown. Further, when the main computer stops running and processing is transferred to the standby computer, the standby computer is allowed to recognize a disk volume that had been recognized by the main computer. Because of these features, cluster service technology is incorporated into important systems.
A technology available for data redundancy retains a copy of data among a plurality of storage systems connected to a host computer. A technology for allowing storage systems that may be positioned physically far from each other to exchange data is called a remote copy. A remote copy technology has also been proposed that permits a plurality of storage systems to mutually copy data without a host computer. When the above remote copy technology is used in conjunction with a cluster configuration technology, an increased degree of system/data redundancy can be provided.
The provision of an increased degree of system/data redundancy will now be described with reference to an example in which a storage system is connected to each of two host computers, one host computer being designated as the active computer and the other as the standby computer to form a cluster. If the storage system connected to the active host computer performs a remote copy to the storage system connected to the standby host computer, setup is performed so that a volume on the active storage system (remote copy source) can be recognized by the active host computer connected to the active storage system, and so that a volume on the standby storage system (remote copy destination) can be recognized by the standby host computer connected to the standby storage system. If a failure occurs in the active host computer in the system described above, the cluster service transfers processing to the standby host computer so that the standby host computer can recognize the data in the storage system at the remote copy destination.
Further, the storage system connected to the host computer that has taken over processing can be set as the remote copy source, with the storage system that was previously the remote copy source set as the remote copy destination. Even if a failure occurs in a host computer, the remote copy direction of the storage systems can thus be changed (by interchanging the copy source and copy destination) so that remote copy operations can continue without halting the overall system operation.
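The failover and copy-direction reversal described above can be sketched as follows. This is only a minimal illustration, not the control logic of any actual cluster service: the `Site` and `ClusterPair` classes, the `host_alive` flag, and the host and storage names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One host computer and the storage system connected to it (hypothetical model)."""
    host: str
    storage: str
    host_alive: bool = True


class ClusterPair:
    """Two-site cluster with a remote copy running between the two storage systems."""

    def __init__(self, active: Site, standby: Site):
        self.active = active
        self.standby = standby
        # The remote copy initially runs from the active site to the standby site.
        self.copy_source = active.storage
        self.copy_destination = standby.storage

    def failover(self) -> None:
        """Transfer processing to the standby host and reverse the copy direction."""
        if self.active.host_alive:
            return  # nothing to do while the active host is running
        # The standby host takes over processing ...
        self.active, self.standby = self.standby, self.active
        # ... and the copy source and copy destination are interchanged.
        self.copy_source, self.copy_destination = (
            self.copy_destination,
            self.copy_source,
        )


pair = ClusterPair(Site("host-1", "storage-1"), Site("host-2", "storage-2"))
pair.active.host_alive = False  # simulate a failure in the active host
pair.failover()
```

After `failover()` returns, `host-2` is the active host and `storage-2` is the remote copy source, so copying continues toward `storage-1` in the opposite direction.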
Two remote copy methods are available: the synchronous transfer method and the asynchronous transfer method. FIG. 3 illustrates how the remote copy process is performed. For explanation purposes, the computers constituting the individual systems are designated as node A, node B, node C, and node D.
When, in a remote copy operation 1200 based on the synchronous transfer method, storage system A 1020 receives a write instruction for data from node A 1010 ((1)), it issues a write instruction for the same data to storage system B 1021 ((2)). When the data is completely written into storage system B 1021, a completion notification is transmitted to storage system A 1020 ((3)), and a write completion notification is issued to node A 1010 ((4)). In this manner, each update is performed while the data retained by storage system A 1020, which is connected to node A 1010, is kept identical with the data retained by storage system B 1021. This manner of remote copy operation is referred to as a synchronous remote copy operation. On the other hand, when, in a remote copy operation 1201 based on the asynchronous transfer method, storage system C 1022 receives a write instruction for data from node C 1013, it issues a write completion notification for the same data to node C 1013 ((2)). Storage system C 1022 then issues a write instruction to storage system D 1023 asynchronously relative to the process requested by node C 1013, and receives a write completion notification from storage system D 1023 ((4)).
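The two message orderings described above can be sketched as follows. This is a simplified in-memory model offered for illustration only: the `Storage` class, the function names, and the event strings are hypothetical, and the transfer itself is reduced to a dictionary update.

```python
from collections import deque


class Storage:
    """Hypothetical in-memory stand-in for a storage system."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}        # data retained by this storage system
        self.pending = deque()  # writes not yet sent to the copy destination


def synchronous_write(source, destination, key, value):
    """Return the ordered events of one write under the synchronous transfer method."""
    events = ["write instruction received from host"]
    source.blocks[key] = value
    events.append("write issued to copy destination")
    destination.blocks[key] = value
    events.append("completion received from copy destination")
    # The host is notified only after the copy destination holds the data.
    events.append("completion notified to host")
    return events


def asynchronous_write(source, key, value):
    """Return the ordered events seen by the host under the asynchronous method."""
    events = ["write instruction received from host"]
    source.blocks[key] = value
    source.pending.append((key, value))  # transfer is deferred
    events.append("completion notified to host")
    return events


def transfer_pending(source, destination):
    """Send deferred writes to the copy destination, independently of the host."""
    events = []
    while source.pending:
        key, value = source.pending.popleft()
        events.append("write issued to copy destination")
        destination.blocks[key] = value
        events.append("completion received from copy destination")
    return events


a, b = Storage("A"), Storage("B")
sync_events = synchronous_write(a, b, "block-0", "v1")

c, d = Storage("C"), Storage("D")
async_events = asynchronous_write(c, "block-0", "v1")
stale = "block-0" not in d.blocks      # destination has not received the write yet
later_events = transfer_pending(c, d)  # runs on the source's own schedule
```

In the synchronous case the host notification is the last event of the write; in the asynchronous case it precedes the transfer, so the copy destination briefly lacks the new data until `transfer_pending` runs.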
The difference between the two methods will now be described. When performing a remote copy operation 1200 based on the synchronous transfer method, storage system A 1020 at the remote copy source copies the data written to it to storage system B 1021 at the remote copy destination synchronously relative to a write instruction from node A 1010, which is a host computer. Therefore, the two storage systems usually retain the same data. When performing a remote copy operation 1201 based on the asynchronous transfer method, storage system C 1022 at the remote copy source copies the data written to it to storage system D 1023 at the remote copy destination asynchronously relative to a write instruction from node C 1013, which is a host computer. In other words, storage system C 1022 transfers the data designated by a write request from node C 1013 to storage system D 1023, which is the remote copy destination, after issuing a notification of completion of the data write to node C 1013. This data transfer operation is performed according to a task schedule unique to storage system C 1022. Therefore, storage system D 1023 at the remote copy destination retains old data for a longer period of time than the remote copy source. However, the data write completion notification is transmitted to node C 1013 without waiting for the data transfer to storage system D 1023 at the remote copy destination. As a result, node C 1013 can immediately proceed to the next process (see, e.g., U.S. Pat. No. 5,554,347).
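The consequence for the host computer can be illustrated with a deliberately simplified timing model. The function name `host_wait_ms`, its parameters, and the millisecond figures are assumptions made for illustration; they are not taken from the cited reference or from any real storage system.

```python
def host_wait_ms(local_write_ms, link_round_trip_ms, remote_write_ms, synchronous):
    """Time the host waits for a write completion notification (simplified model)."""
    if synchronous:
        # The host is notified only after the copy destination confirms the write.
        return local_write_ms + link_round_trip_ms + remote_write_ms
    # The host is notified as soon as the copy source has accepted the write.
    return local_write_ms


# Illustrative figures only; they are not measurements.
sync_wait = host_wait_ms(1, 20, 1, synchronous=True)
async_wait = host_wait_ms(1, 20, 1, synchronous=False)
```

Under these assumed figures the host waits 22 ms per write with synchronous transfer and 1 ms with asynchronous transfer, at the cost of the copy destination holding old data until the deferred transfer completes.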