The present invention relates to a large area data storage system wherein an external storage device can quickly recover from a blockage that occurs due to a disaster, and in particular, to a large area data storage system wherein three or more external storage devices located at distances of one hundred to several hundred km perform complementary operations.
Disclosed in JP11338647, by the present inventor, is a method whereby doubling of a system or data is performed synchronously or asynchronously. Further, disclosed in JP2000305856, by the present inventor, is a technique for asynchronously copying data to a remote area.
As is described above, the present inventor has proposed asynchronous remote copy techniques whereby an external storage device (hereinafter referred to as a storage sub-system), without receiving special control information specifying data order, receives data from a large computer system, a server or a personal computer connected to a network, or another higher computer system (hereinafter referred to as a host), and employs asynchronous transmission to continuously write data to a remotely situated second storage sub-system, while constantly maintaining the order of the data.
Further, when data is to be copied using the synchronous transmission technique, the data update process between a host and a storage sub-system connected thereto is linked to the copy control process between that storage sub-system and a second storage sub-system located in the vicinity or in a remote area. Therefore, macroscopically, the data held by the two storage sub-systems are constantly matched, and the order in which the data are written is also maintained. When an appropriate data transfer path is selected, the copy process effected through the synchronous transfer of data can be performed even when the distance between the two storage sub-systems exceeds 100 km.
Recently, awareness has grown of the importance of the safe storage and maintenance of data, giving rise to many demands in the data storage market for viable disaster recovery systems. Conventional means devised to satisfy these demands generally provide for the synchronous and asynchronous transfer of data between two connected data storage points. However, further requests from the market call for the inclusion of third and fourth data storage points (hereinafter referred to as data centers), and for the construction of comprehensive, or near comprehensive, disaster recovery systems to service these data centers.
The reasoning behind these requests is that so long as three or more data centers are established, even if a disaster strikes one of the data centers, the redundancy represented by the storage and maintenance of data at the remaining data centers will enable data to be recovered and will reduce the risk represented by the occurrence of a succeeding disaster.
According to the conventional technique, adequate consideration is not given to a case wherein three or more data centers have been established and I/O data is received from a host having a logical volume of only one storage sub-system, and the remote copy technique is used for transmissions to multiple data centers. For example, for an event wherein a data center is disabled by a disaster, little consideration is given as to whether a logical volume that guarantees data order can be maintained between two or more remaining data centers, whether the update state can be maintained and non-matching data can be removed, and whether a system that can copy data to both a nearby and a remote location can be re-constructed.
Since it cannot be known when a disaster will occur, the order in which data is updated must be constantly maintained among a grouping of three or more data centers.
Therefore, a large area data storage system must be constructed wherein a specific function is not uniquely provided for a host and a plurality of remote copying systems are coupled together, wherein received data having the same logical volume is distributed to another storage sub-system situated at a nearby or a remote location, and wherein the storage sub-systems of data centers constantly guarantee the order in which data received from the host are updated.
To resolve the above problem, according to the invention, a large area data storage system copies data to another storage sub-system without providing a redundant logical volume for a storage sub-system.
Further, according to the present invention, the reconstruction of a large area storage system is assumed to be the recovery operation objective following a disaster. During normal operation, management information is directly exchanged by storage sub-systems that do not perform data transfer functions, and the data update state is monitored and controlled by each storage sub-system. Then, during a recovery operation (re-synchronization, or resync) following a disaster, only the difference between the data stored in the storage sub-systems immediately before the disaster occurred is transmitted, and the exchange of hosts (fail over) and the continuation of the application are performed immediately.
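The difference-only resync described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each storage sub-system keeps a set of block numbers written locally but not yet confirmed by its partner. The names `UpdateTracker`, `record_write`, `confirm`, and `resync` are hypothetical and do not appear in the specification.

```python
class UpdateTracker:
    """Hypothetical record of blocks written locally but not yet confirmed remotely."""

    def __init__(self) -> None:
        self._pending: set[int] = set()

    def record_write(self, block: int) -> None:
        # Called when the host updates a block on the local sub-system.
        self._pending.add(block)

    def confirm(self, block: int) -> None:
        # Called when the partner reports that the block has been reflected.
        self._pending.discard(block)

    def pending(self) -> list[int]:
        return sorted(self._pending)


def resync(source: dict[int, bytes], target: dict[int, bytes],
           tracker: UpdateTracker) -> list[int]:
    """Copy only the blocks still marked pending and return their numbers."""
    sent = tracker.pending()
    for block in sent:
        target[block] = source[block]
        tracker.confirm(block)
    return sent
```

In this simplified model a block written twice is tracked once, so a real implementation would need finer-grained state. The sketch shows only that the volume of data sent during recovery depends on the number of unconfirmed blocks, not on the size of the logical volume.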
<To Constantly Guarantee the Order for Updating Data>
A supplementary explanation will now be given for the time range for holding a data order.
The I/O data issued by the host is written to the storage sub-system, and the host receives a data-write-complete notification from the storage sub-system before performing the next step. When the host does not receive a data-write-complete notification from the storage sub-system, or receives a blockage notification, the host does not normally issue the next I/O data. Therefore, the data writing order should be maintained when the storage sub-system performs a specific order holding process before and after it transmits a write-end notification to the host.
In the remote copy process performed by the synchronous transfer of data, the data to be transmitted and copied is written to a storage sub-system situated nearby or at a remote location (hereinafter referred to simply as a different location), and when a write-end notification is received from the storage sub-system at the different location, the write-end notification is reported to the host. Compared with when a remote copy process is not performed, remote copy time and data transfer time are increased, and the performance is delayed. When the connection distance for a remote copy process is extended, the processing time for the data transfer is increased, and the remote copy process causes the performance of the I/O process to be further deteriorated. One of the methods used to resolve this problem is the asynchronous transfer of data.
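The effect of distance on a synchronous remote copy can be estimated with a rough calculation. The sketch below assumes a signal propagation speed of about 200 km per millisecond in optical fiber and ignores switching and protocol overhead. The function name and the local and remote write times are illustrative assumptions, not values from the specification.

```python
# Assumption: light travels roughly 200 km per millisecond in optical fiber.
FIBER_KM_PER_MS = 200.0


def sync_write_latency_ms(distance_km: float, local_write_ms: float,
                          remote_write_ms: float) -> float:
    """Estimate host-visible latency of one synchronously copied write.

    The host is notified only after the remote sub-system has acknowledged,
    so the round trip and the remote write are added to the local write time.
    """
    round_trip_ms = 2.0 * distance_km / FIBER_KM_PER_MS
    return local_write_ms + round_trip_ms + remote_write_ms


if __name__ == "__main__":
    for km in (10, 100, 500):
        print(km, "km:", sync_write_latency_ms(km, 0.5, 0.5), "ms")
```

Under these assumptions the propagation delay alone adds about 1 ms per write at 100 km and about 5 ms at 500 km, which is why the I/O performance deteriorates as the connection distance is extended.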
During the asynchronous transfer of data, upon receiving I/O data from the host, the storage sub-system transmits data to a storage sub-system at a different location, and returns a write-end notification to the host without waiting for the write-end notification from the storage sub-system at the different location. Thus, the transmission of data between the storage sub-systems is not associated with the I/O process performed by the host, and can be performed asynchronously with the I/O process of the host. However, unless the data is written to the storage sub-system at the different location in the order in which the data was received from the host, the data order may not be maintained by the storage sub-system at the different location, and data non-matching may occur between the two storage sub-systems. The additional provision of a function that constantly guarantees the data order is the best possible means by which to reduce occurrences of this problem.
Compared with the storage sub-system that has received the host I/O data, the updating of data in the storage sub-system at a different location is generally delayed. However, so long as the data is written to the storage sub-system following the order in which the data arrived from the host, there is no divergence in the data order, and the recovery from a blockage can be performed by a journal file system or a database recovery process.
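The following minimal sketch illustrates why a delayed but ordered copy remains recoverable while an out-of-order copy does not. The block names and the helpers `apply_writes` and `prefix_states` are hypothetical. The point is only that an ordered copy always equals some past state of the original volume.

```python
def apply_writes(writes: list[tuple[str, str]]) -> dict[str, str]:
    """Apply (block, data) writes in the given order and return the volume."""
    volume: dict[str, str] = {}
    for block, data in writes:
        volume[block] = data
    return volume


def prefix_states(writes: list[tuple[str, str]]) -> list[dict[str, str]]:
    """Every state the original volume passed through, oldest first."""
    return [apply_writes(writes[:k]) for k in range(len(writes) + 1)]


# Hypothetical host writes: a journal entry, the data, then the commit record.
host_writes = [("log", "begin"), ("data", "v2"), ("log", "commit")]

# Ordered but delayed: the remote side has applied only the first two writes.
delayed = apply_writes(host_writes[:2])

# Out of order: the commit record arrived, but the data write did not.
reordered = apply_writes([host_writes[0], host_writes[2]])

history = prefix_states(host_writes)
print(delayed in history)    # the delayed copy matches a past state
print(reordered in history)  # the reordered copy matches no past state
```

A journal file system or database can recover from the delayed copy because it is a state the original volume actually held. The reordered copy records a commit without the data it refers to.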
There is another method by which data can be remotely copied to a storage sub-system at a different location and reflected there without maintaining the data order. According to this method, data from the host that have been received up to a specific time are transmitted to a different location and are collectively written to the storage sub-system. When the data received up to a specific time have been written, the data transfer process is terminated, and thereafter, data transfer by remote copying is halted until collective writing is next performed. While data transfer is halted, the data order and the consistency of the I/O data received from the host are guaranteed.
According to this method, the function for providing the data order information is not required. A specific amount of data to be updated is stored and is collectively transmitted, and when the writing of data to a remote side has been completed, the data matching is guaranteed. According to this method, however, when a blockage occurs during remote copying, the data on the remote side has not been updated in the original update order, so that all the data are lost. Only during a period in which the data transfer by remote copying is halted can the data matching be guaranteed.
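The weakness of the collective-write method can be seen in a small sketch. It assumes a batch is a list of (block, data) pairs written in no particular order, and it simulates a blockage by stopping part way through. The function `apply_batch` and its `fail_after` parameter are hypothetical names introduced for illustration.

```python
from typing import Optional


def apply_batch(remote: dict[str, int], batch: list[tuple[str, int]],
                fail_after: Optional[int] = None) -> bool:
    """Write a batch to the remote volume; stop early to simulate a blockage.

    Returns True if the whole batch was written, False if it was interrupted.
    """
    for i, (block, data) in enumerate(batch):
        if fail_after is not None and i >= fail_after:
            return False
        remote[block] = data
    return True


before = {"a": 1, "b": 1}
after = {"a": 2, "b": 2}
batch = [("b", 2), ("a", 2)]  # no order information is carried

complete = dict(before)
apply_batch(complete, batch)

interrupted = dict(before)
apply_batch(interrupted, batch, fail_after=1)

print(complete == after)                  # consistent once the batch finishes
print(interrupted in (before, after))     # neither old nor new state mid-batch
```

Once the batch finishes, the remote volume matches the state captured at the specific time. If the batch is interrupted, the remote volume is neither the previous nor the new consistent state, and nothing identifies which blocks were written.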
The technique of the present inventor of the “remote copying by the asynchronous transfer of data for constantly guaranteeing the data order” includes a feature that, before returning an end notification to the host, the storage sub-system performs a process for guaranteeing the data order. Since the data order information for each block is managed before the end notification is returned to the host, regardless of the overhead in the controller of the storage sub-system or the delay time for the internal process, the data order can be consistently guaranteed.
Actually, the data order information is managed or controlled for each block during a time considerably shorter than the interval at which the host issues the I/O. The timeout value for the distribution of data to the storage sub-system at the remote location is set to at least one hour. The importance of this is that the remote copy technique of the present invention transmits data, together with order information, to a data block and writes the data in order in accordance with the order information. This is possible, so long as the order is correct, because even when the time lag for the updating of data between the local and remote systems is half a day, for example, this is much better than when, due to the non-matching of data, all the updated data are lost.
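One way to picture writing data "in order in accordance with the order information" is the following minimal sketch. It assumes the order information is a simple sequence number assigned before the write-end notification is returned to the host. The classes `Primary` and `Remote` and their methods are hypothetical and are not taken from the specification.

```python
class Primary:
    """Local sub-system: assigns order information before acknowledging the host."""

    def __init__(self) -> None:
        self.volume: dict[str, str] = {}
        self.next_seq = 0
        self.outbox: list[tuple[int, str, str]] = []

    def write(self, block: str, data: str) -> str:
        seq = self.next_seq          # order information fixed first
        self.next_seq += 1
        self.volume[block] = data
        self.outbox.append((seq, block, data))  # sent later, asynchronously
        return "write-end"           # host is notified without waiting


class Remote:
    """Remote sub-system: applies blocks strictly in sequence-number order."""

    def __init__(self) -> None:
        self.volume: dict[str, str] = {}
        self.expected = 0
        self.held: dict[int, tuple[str, str]] = {}

    def receive(self, seq: int, block: str, data: str) -> None:
        self.held[seq] = (block, data)
        # Apply every block whose predecessors have all been applied.
        while self.expected in self.held:
            b, d = self.held.pop(self.expected)
            self.volume[b] = d
            self.expected += 1
```

In this sketch a block that arrives early is held until the missing predecessors arrive, so the remote volume may lag behind the local one but never reflects writes out of order.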