This invention relates to a computer system that includes a storage system, and more specifically, to a remote copying system in which data is copied between two or more storage systems.
In a computer system having a storage system, a failure in the storage system caused by a disaster such as a power failure or natural calamity may stop a business that uses the computer system and, in the worst case, may cause loss of the data stored in the storage system. One of the techniques that can be used to avoid such a situation is remote copying, in which data stored in a storage system on a primary site (the location where a computer system is set up) is transferred to a remote site (secondary site), which is located at a great distance from the primary site and holds a storage system of its own, and is copied to the storage system on the secondary site.
Remote copying is a technique of transferring write data, which is written to a storage system of the primary site by a host computer included in the computer system of the primary site, to a storage system of the secondary site and storing the data in the storage system of the secondary site. There are two types of remote copying: synchronous remote copying and asynchronous remote copying.
In synchronous remote copying, when a host sends a write command to a storage system of the primary site (primary storage system), the primary storage system first transfers write data to a storage system of the secondary site (secondary storage system) and then sends a write completion report in response to the write command to the host of the primary site.
In asynchronous remote copying, when the host sends a write command to the primary storage system, the primary storage system first sends a write completion report in response to the write command to the host, and then transfers the write data to the secondary storage system.
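The difference between the two types lies in the order of the transfer and the write completion report. The following is a minimal Python sketch of that ordering only. The class names, the `events` log, and the `drain` method are hypothetical and are introduced purely for illustration; they are not taken from any actual storage system.

```python
class SecondaryStorage:
    """Hypothetical stand-in for the secondary storage system."""

    def __init__(self):
        self.blocks = {}

    def receive(self, address, data):
        self.blocks[address] = data


class PrimaryStorage:
    """Hypothetical stand-in for the primary storage system."""

    def __init__(self, secondary, synchronous):
        self.secondary = secondary
        self.synchronous = synchronous
        self.blocks = {}
        self.pending = []  # asynchronous only: reported complete, not yet transferred
        self.events = []   # records the order of "transfer" and "completion"

    def write(self, address, data):
        self.blocks[address] = data
        if self.synchronous:
            # Synchronous: transfer first, then report completion to the host.
            self.secondary.receive(address, data)
            self.events.append("transfer")
            self.events.append("completion")
        else:
            # Asynchronous: report completion first, transfer later.
            self.events.append("completion")
            self.pending.append((address, data))
        return "completion"

    def drain(self):
        """Transfer the writes that were reported complete but not yet sent."""
        while self.pending:
            address, data = self.pending.pop(0)
            self.secondary.receive(address, data)
            self.events.append("transfer")
```

In this sketch, a synchronous write returns only after the secondary storage system holds the data, whereas an asynchronous write returns while the data is still waiting in `pending`.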
A logical disk set in the primary storage system is hereinafter referred to as a primary logical disk. A logical disk set in the secondary storage system to store a copy of data in the primary logical disk through remote copying is hereinafter referred to as a secondary logical disk.
When a failure occurs in the computer system on the primary site, the host on the secondary site uses data stored in the secondary logical disk of the secondary storage system to resume processing by an application program. In order for the host of the secondary site to use the secondary logical disk to resume processing by an application program, it is indispensable that the secondary logical disk maintains consistency. The consistency here is a concept regarding the order of data written to a logical disk, and is considered to be achieved when the following two conditions are fulfilled:
(1) In the case where a host writes first data A and then data B to a logical disk while keeping their order intact, the host first writes the data A to a storage system and then waits to receive a report of completion of writing the data A from the storage system before writing the data B to the storage system.
(2) In the case where the condition (1) is satisfied, a part or the entirety of the data B exists on the logical disk only when the entirety of the data A is on the logical disk.
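Condition (2) can be expressed as a simple check. The following Python sketch assumes a hypothetical representation in which each write issued in order by the host is marked as "none", "partial", or "full" according to how much of it is present on the logical disk. Both the representation and the function name are assumptions made for illustration.

```python
def is_consistent(states):
    """Check condition (2) for writes listed in the order the host issued them.

    Each entry is "none", "partial", or "full" (a hypothetical encoding of
    how much of that write is present on the logical disk). A write may be
    partly or wholly present only if every earlier write is wholly present.
    """
    all_earlier_full = True
    for state in states:
        if state != "none" and not all_earlier_full:
            return False
        if state != "full":
            all_earlier_full = False
    return True
```

For example, `is_consistent(["full", "partial"])` holds because the data A is entirely on the disk, whereas `is_consistent(["none", "full"])` does not hold because the data B is present without the data A.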
In asynchronous remote copying, the primary storage system gives order information (for example, a sequential number) to write data received from a host and transfers the write data, with the sequential number attached, to the secondary storage system. The secondary storage system stores data in the secondary logical disk in the order dictated by the sequential number, thereby maintaining the consistency of the secondary logical disk. Another conceivable method of maintaining the consistency of the secondary logical disks in asynchronous remote copying is to group the data written to the primary logical disk within a given period into one group, and have the primary storage system atomically transfer the group of data to the secondary storage system, where the group of data is copied to the secondary logical disk. Still another way is to employ the method disclosed in U.S. Pat. No. 6,408,370 B.
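The sequential-number approach can be sketched as follows. This Python example assumes that write data may arrive at the secondary storage system out of order and that the secondary storage system holds back any write whose predecessors have not yet arrived. The holding buffer and all names are hypothetical and are introduced only to illustrate how applying writes in sequential-number order preserves consistency.

```python
class SecondaryLogicalDisk:
    """Hypothetical secondary logical disk that applies writes in sequence order."""

    def __init__(self, first_sequence=1):
        self.next_sequence = first_sequence
        self.held = {}     # sequential number -> (address, data), awaiting predecessors
        self.blocks = {}   # contents of the secondary logical disk
        self.applied = []  # sequential numbers in the order they were applied

    def receive(self, sequence, address, data):
        """Accept one transferred write and apply every write that is now in order."""
        self.held[sequence] = (address, data)
        while self.next_sequence in self.held:
            held_address, held_data = self.held.pop(self.next_sequence)
            self.blocks[held_address] = held_data
            self.applied.append(self.next_sequence)
            self.next_sequence += 1
```

If the write numbered 2 arrives before the write numbered 1, it remains in `held` and is applied only after the write numbered 1 has been applied.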
Synchronous remote copying, on the other hand, has difficulty maintaining data consistency between plural pairs of primary logical disks and secondary logical disks when each pair is allowed to suspend and start (or restart) copying individually.
For instance, consider a case where a pair A and a pair B are set across the primary site and the secondary site, and remote copying for the pair A is suspended because of a communication error between the primary logical disk and the secondary logical disk. Then assume that the data B is written to the primary logical disk of the pair B after the data A is written to the primary logical disk of the pair A by a host of the primary site. Since remote copying is suspended for the pair A due to the communication error, the primary storage system having the primary logical disk of the pair A sends, upon receiving the data A from the host of the primary site, a write completion report to the host of the primary site without sending the data A to the secondary storage system. Receiving the completion report, the host of the primary site then writes the data B to the primary storage system having the primary logical disk of the pair B. Since remote copying is available for the pair B, the primary storage system sends the received data B to the secondary storage system and then sends a write completion report to the host of the primary site. As a result, the secondary logical disk of the pair B stores the data B whereas the secondary logical disk of the pair A does not store the data A, disrupting the consistency between the secondary logical disks of the pair A and the pair B.
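The scenario above can be reproduced with a short Python sketch. The `Pair` class and the `run_scenario` function are hypothetical and model only the behavior described in this paragraph: a suspended pair reports completion without transferring the data, while an active pair transfers the data before reporting completion.

```python
class Pair:
    """Hypothetical pair of a primary logical disk and a secondary logical disk."""

    def __init__(self, name):
        self.name = name
        self.suspended = False
        self.primary = []
        self.secondary = []

    def write(self, data):
        self.primary.append(data)
        if not self.suspended:
            # Synchronous remote copying: transfer before reporting completion.
            self.secondary.append(data)
        # A suspended pair reports completion without transferring the data.
        return "completion"


def run_scenario():
    pair_a = Pair("A")
    pair_b = Pair("B")
    pair_a.suspended = True  # communication error suspends copying for the pair A

    # The host writes the data B only after the data A is reported complete.
    if pair_a.write("data A") == "completion":
        pair_b.write("data B")
    return pair_a.secondary, pair_b.secondary
```

Calling `run_scenario()` leaves the secondary logical disk of the pair A empty while the secondary logical disk of the pair B holds the data B, which is the inconsistent state described above.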
U.S. Pat. No. 5,692,155B discloses a method to solve this problem. According to this method, the consistency between secondary logical disks is ensured by the following steps:
(1) A host instructs plural pairs to pause the processing of writing data from the host to the primary logical disk of each pair.
(2) Upon receiving the instruction from the host, the primary storage system executes writing of data that is received from the host prior to the instruction, while sending a busy signal in response to a write request that is received from the host after the instruction.
(3) The host further instructs every pair to suspend remote copying. Upon receiving this instruction, the primary storage system suspends remote copy processing.
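Steps (1) to (3) can be sketched as follows. This Python example is a simplified illustration under stated assumptions, not the method of the cited patent itself. All names are hypothetical, writes are modeled as completing immediately (so every write received before the pause instruction has already been executed when the pause takes effect), and anything that happens after step (3) is outside the sketch.

```python
class PrimaryStorageSystem:
    """Hypothetical primary storage system managing several pairs."""

    def __init__(self, pair_names):
        self.paused = False
        self.copying = {name: True for name in pair_names}
        self.primary = {name: [] for name in pair_names}
        self.secondary = {name: [] for name in pair_names}

    def write(self, pair, data):
        if self.paused:
            # Step (2): a write request received after the instruction gets a busy signal.
            return "busy"
        self.primary[pair].append(data)
        if self.copying[pair]:
            self.secondary[pair].append(data)  # synchronous transfer
        return "completion"

    def pause_writes(self):
        # Steps (1) and (2): earlier writes have already been executed in this model.
        self.paused = True

    def suspend_all(self):
        # Step (3): suspend remote copying for every pair.
        for name in self.copying:
            self.copying[name] = False
```

Because no write is accepted between the pause instruction and the suspension, in this sketch every secondary logical disk holds exactly the data that its primary logical disk held when writing was paused.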