The present invention relates to data synchronization control in asynchronous remote copy between disk control devices.
In computer systems of recent years, the core business processing of banks and stock companies has shifted from centralized management on a large computer to a distributed database system (DDBMS) built around a client/server system. Such a distributed database environment adopts an HA (High Availability) cluster configuration, in which a plurality of servers and disk array devices process data in response to client requests. In such an HA cluster configuration, data is duplicated between data centers located in remote places as a precaution against a disaster such as an earthquake. Duplication is typically achieved by connecting two disk array devices (storage systems) remotely via a public line or a private line, and copying the write data that a host computer feeds to the local disk array device onto the remote disk array device.
Methods for effecting duplication between disk array devices are broadly classified into a synchronous system and an asynchronous system.
In the synchronous system, data of a write request from a host device of the local side is first written into a cache of a disk array device of the local side. Subsequently, the disk array device of the local side transfers the data written into the cache to a remote disk array device. Only after receiving an acknowledgement signal indicating that the write request data has been received by the disk array device of the remote side does the disk array device of the local side return an acknowledgement of the write request to the host. In other words, the disk array device of the local side makes sure that the data has been copied to the disk array device of the remote side before returning an acknowledgement to the host. This acknowledgement assures the host that the data has been delivered to the remote side. (The term "synchronous" is used in the sense that a write request from the host and copying onto the disk array device of the remote side are conducted in synchronism.) Because waiting for the acknowledgement signal from the remote side causes a delay, the synchronous system is suitable for relatively short distances (at most 100 km), over which the propagation delay of data transmission between the local side and the remote side is short. It is not suitable for long distance transfer using, for example, a public line network. Data recorded in the disk array devices of the local side and the remote side is written onto the respective physical disks via respective drive control circuits.
On the other hand, the asynchronous system is suited for long distance transfer. When the host device of the local side issues a write request, the acknowledgement (write completion) is returned to the host device at the time point when the data of the write request has been written into a cache of the local side. After the acknowledgement to the host device, the data written into the cache is copied (transferred) to a disk array device of the remote side at a different timing (it is in this sense that the system is asynchronous). In this asynchronous system, the acknowledgement of the write request is returned to the host device irrespective of the timing of transferring the data to the disk array device of the remote side. Therefore, the acknowledgement is returned earlier than in the synchronous system, and the host can proceed to the next processing earlier.
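The difference in acknowledgement timing between the two systems can be illustrated with a minimal sketch. The class names `LocalArray` and `RemoteArray`, the `pending` queue, and the `transfer_pending` method are hypothetical names introduced only for illustration; they model the behavior described above and are not taken from any actual disk array device.

```python
from collections import deque


class RemoteArray:
    """Hypothetical stand-in for the disk array device of the remote side."""

    def __init__(self):
        self.cache = {}

    def receive(self, block, data):
        self.cache[block] = data
        return True  # acknowledgement signal back to the local side


class LocalArray:
    """Hypothetical disk array device of the local side (assumed model)."""

    def __init__(self, remote, synchronous):
        self.remote = remote
        self.synchronous = synchronous
        self.cache = {}
        self.pending = deque()  # writes not yet copied (asynchronous only)

    def write(self, block, data):
        self.cache[block] = data
        if self.synchronous:
            # Synchronous: wait for the remote acknowledgement first.
            if not self.remote.receive(block, data):
                raise IOError("remote copy failed")
        else:
            # Asynchronous: acknowledge now, copy at a different timing.
            self.pending.append((block, data))
        return "write complete"

    def transfer_pending(self):
        # Runs independently of host acknowledgements.
        while self.pending:
            block, data = self.pending.popleft()
            self.remote.receive(block, data)
```

In the synchronous mode the remote cache already holds the data when `write` returns. In the asynchronous mode the remote cache is updated only once `transfer_pending` has run, which is why the host cannot infer the remote state from the write completion alone.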
Remote copy is described in Tokuhyo-Hei-8-509565.
In the above described method of transferring data to the remote side asynchronously, the disk array device of the local side reports the write request completion to the host at the time point when the data has been stored in the disk array device of the local side, irrespective of whether the data has been stored on the remote side or not. Therefore, it is difficult for the host of the local side to confirm the completion of synchronization of the host write request to the remote side (i.e., to determine whether the data generated by the host write request has reliably been transferred to the disk array device of the remote side). This confirmation of synchronization completion of the host write request on the remote side is needed especially for a commit (assurance that data has reliably been stored in a storage) of a history log file or the like of a database, with a transaction of the database taken as the unit. Here, the commit means a series of processing of writing the update results of a plurality of databases concerning one transaction into an actual storage system together with a log file.
Furthermore, from the viewpoint of data recovery at the time of a disaster, there is a problem that data which remains in the disk array device and has not yet been transferred is lost when a fault occurs at the primary site (main site), and that, after takeover to the secondary site (back-up site) and the start of operation there, it is impossible to know which data is assured.
However, the asynchronous transfer system of the above described conventional technique does not have a synchronization confirmation method for a host I/O because of inherent characteristics of asynchronous transfer. In other words, there is not provided a method of determining whether a write request at a commit point for a transaction fed from an APP (application program) has been positively written into a remote site (secondary site), which is needed for operation of a database (DB).
Hereafter, the problems will be described concretely. First, the case where a computer is connected to one storage system will be described. Thereafter, the problems will be described specifically for the case where the storage system is conducting asynchronous data transfer (asynchronous remote copy).
First, the case where one storage system is connected to a computer will now be described. If an application of the computer executes a write command (request), then, as long as no commit command has been issued, the data of the write command is typically just written into a data buffer included in the computer. At this point, the data in the data buffer does not coincide with the data in the storage system. If the application thereafter issues a commit command, then the data in the data buffer is actually written to the storage system by a write command. The storage system stores the write data in a cache memory. (At this time point, the data in the storage system coincides with the data in the computer.) Thereupon, the storage system reports write request completion to the computer which issued the write command. Upon confirming the write request completion, the computer returns an acknowledgement for the commit command to the application. By means of this return, the application knows that the data in the storage system coincides with the data in the computer.
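The sequence above can be sketched as follows. The names `Computer`, `Storage`, `data_buffer`, and `dirty` are hypothetical and chosen only for illustration; the sketch assumes a simple block-to-data mapping and does not model any particular operating system or database.

```python
class Storage:
    """Hypothetical storage system with a cache memory."""

    def __init__(self):
        self.cache = {}

    def write(self, block, data):
        self.cache[block] = data
        return True  # write request completion


class Computer:
    """Hypothetical computer holding a data buffer in front of the storage."""

    def __init__(self, storage):
        self.storage = storage
        self.data_buffer = {}
        self.dirty = set()  # blocks buffered but not yet written out

    def write(self, block, data):
        # Without a commit, the data only reaches the data buffer.
        self.data_buffer[block] = data
        self.dirty.add(block)

    def commit(self):
        # On commit, buffered data is actually written to the storage system.
        for block in sorted(self.dirty):
            if not self.storage.write(block, self.data_buffer[block]):
                return False
        self.dirty.clear()
        return True  # acknowledgement for the commit command
```

Before `commit` is called the storage cache and the data buffer differ; after `commit` returns `True` they coincide, which is the guarantee the application relies on.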
Subsequently, the case where asynchronous remote copy is being conducted will now be described. If an application of the computer issues a commit command, then the data in the data buffer is written into a cache of the storage system of the local side by a write command. As its acknowledgement, the storage system of the local side returns write completion to the computer. Upon receiving the write completion, the computer returns an acknowledgement for the commit to the application. However, this return merely indicates that the data of the storage system of the local side coincides with the data in the data buffer. Suppose that the data in the storage system of the local side disappears after the storage system of the local side returns write request completion and before copying of the data to the remote side is finished. If the application then attempts to continue processing by using the data of the remote side, processing is continued with erroneous data, even though a commit return has been received and the data is believed to be settled in the storage system. In other words, if a fault or the like occurs during asynchronous remote copy, then in some cases the computer application cannot obtain a satisfactory result by means of the conventional commit function.
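The failure scenario just described can be reproduced in a small simulation. Everything here is an assumed model: `AsyncLocalArray`, its `fail` method, and the `commit` helper are hypothetical names, and the remote side is reduced to a dictionary for brevity.

```python
class AsyncLocalArray:
    """Hypothetical local storage system performing asynchronous remote copy."""

    def __init__(self):
        self.cache = {}
        self.remote_cache = {}  # stands in for the storage system of the remote side
        self.pending = []
        self.failed = False

    def write(self, block, data):
        self.cache[block] = data
        self.pending.append((block, data))
        return True  # write completion, returned before any remote copy

    def transfer_pending(self):
        if self.failed:
            return  # nothing can be copied after the local fault
        for block, data in self.pending:
            self.remote_cache[block] = data
        self.pending.clear()

    def fail(self):
        # Local fault: cached and untransferred data disappears.
        self.failed = True
        self.cache.clear()
        self.pending.clear()


def commit(array, data_buffer):
    """Conventional commit: succeeds once the local side reports completion."""
    return all(array.write(b, d) for b, d in data_buffer.items())
```

A usage example under these assumptions: calling `commit(array, {"log": "txn-42"})` returns `True`, yet if `array.fail()` happens before `array.transfer_pending()`, the key `"log"` never appears in `array.remote_cache`. The commit acknowledgement therefore says nothing about the remote side.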
In this way, the asynchronous remote copy of the conventional technique does not have a data synchronization confirmation method for the host I/O because of characteristics of the asynchronous transfer. As a result, the asynchronous transfer system of the conventional technique has a problem that it cannot be determined whether a write request at a commit point for a transaction fed from an APP (application program) has been positively written into a remote site (secondary site), which is needed for operation of a database (DB).
An object of the present invention is to make it possible to confirm data synchronization to the remote site for the host I/O (write request), either at an arbitrary time point or by taking a commit of the host application as the unit.
The above described object can be realized by a storage system of a local side that is used while connected to an upper device of the local side and to a storage system of a remote side. The storage system of the local side includes means for receiving a write command from the upper device, means for transmitting data contained in the write command to the storage system of the remote side, means for receiving, from the upper device, a query command inquiring whether the storage system of the remote side has received the data, and means for transmitting an acknowledgement of the query command to the upper device.
Furthermore, the above described object can be realized by a storage system communicating with an upper device. The storage system includes a first interface circuit supplied with a write command from the upper device, a second interface circuit for outputting data contained in the write command and information identifying the data to a different storage system, the first interface circuit supplied with a query command concerning the data from the upper device, the first interface circuit for outputting information identifying the data outputted together with the data, before transmission from the second interface, to the upper device, the second interface circuit supplied with the information inputted to the different storage system together with the data inputted to the different storage system, and the first interface circuit for outputting the information inputted to the second interface circuit, to the upper device.
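One way to picture the query mechanism is the sketch below. It assumes, purely for illustration, that the "information identifying the data" is a monotonically increasing sequence number and that data is transferred to the different storage system in order; the text above does not specify either point. The names `LocalStorage`, `RemoteStorage`, `query`, and `is_synchronized` are hypothetical.

```python
class RemoteStorage:
    """Hypothetical different (remote) storage system."""

    def __init__(self):
        self.received = {}

    def receive(self, seq_id, block, data):
        self.received[block] = data
        return seq_id  # identifier reported back to the local storage system


class LocalStorage:
    """Hypothetical local storage system answering query commands."""

    def __init__(self, remote):
        self.remote = remote
        self.next_id = 1
        self.last_written_id = 0      # identifier attached before transmission
        self.remote_confirmed_id = 0  # identifier reported by the remote side
        self.pending = []

    def write(self, block, data):
        seq_id = self.next_id
        self.next_id += 1
        self.last_written_id = seq_id
        self.pending.append((seq_id, block, data))
        return True  # write completion to the upper device

    def transfer_one(self):
        # Assumed in-order transfer of the oldest pending write.
        if self.pending:
            seq_id, block, data = self.pending.pop(0)
            self.remote_confirmed_id = self.remote.receive(seq_id, block, data)

    def query(self):
        # Response to a query command from the upper device.
        return self.last_written_id, self.remote_confirmed_id


def is_synchronized(storage, commit_id):
    """Upper-device check: has the remote side received up to commit_id?"""
    _, confirmed = storage.query()
    return confirmed >= commit_id
```

Under these assumptions, the upper device records the identifier returned by `query` right after its commit and later polls `is_synchronized` until the remote-confirmed identifier has caught up, which gives it a commit-level confirmation that the conventional write completion cannot provide.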