Computer storage disaster recovery systems typically address two types of failures: a sudden catastrophic failure at a single point in time, or data loss over a period of time. In the second, gradual type of failure, updates to data volumes may be lost. To assist in the recovery of data updates, a copy of the data may be provided at a remote location. The assignee of the subject patent application, International Business Machines Corporation (IBM®), presently provides two systems for maintaining remote copies of data at a secondary, tertiary or other backup storage device: extended remote copy (XRC) and peer-to-peer remote copy (PPRC). These systems provide a method for recovering data updates made between the last safe backup and a system failure. Such data shadowing systems can also provide an additional remote copy for nonrecovery purposes, such as local access at a remote site. In typical backup systems, data is maintained in volume pairs. A volume pair comprises a volume in a primary storage device and a corresponding volume in a secondary storage device that holds an identical copy of the data maintained in the primary volume. Typically, the primary and secondary storage volumes are maintained in a direct access storage device (DASD). Primary and secondary storage controllers are typically provided to control access to the respective DASDs.
In shadowing or remote copy systems such as those described above, a device or computer receives inbound transfers of data updates and subsequently transfers the updates outbound to another computer or device. For example, a primary storage controller may send transfers to a primary backup appliance, which in turn offloads the transferred data updates to a secondary backup appliance associated with a secondary storage controller at a remote site. The transfers may be accomplished either synchronously with the initial write of data by the primary storage controller, or asynchronously. In an asynchronous system, a backup appliance, which is typically a secondary backup appliance, receives transfers from the primary storage controller either directly or through intermediate appliances or devices, and saves each transfer in a memory device to which it has access. The backup appliance then typically confirms acceptance of the transfer of data with the primary controller. Subsequently, the backup appliance is responsible for offloading this data to the secondary storage controller. Asynchronous solutions, however, can expose the data processing system to error scenarios in which the primary and secondary storage controllers are not consistent. For example, if data transfers are sent from the backup appliance to the secondary controller in an order that differs from the order in which they were received from the primary, and a failure occurs during the transfer of data, the secondary controller will not receive the data in the proper order, and it is no longer guaranteed that the primary and secondary controllers will contain the same data.
Partial solutions to this problem exist, which typically involve forming, at the backup appliance, a consistent set of transactions between the primary and secondary controllers. The backup appliance forms these data sets, or consistency groups, and then offloads them to the secondary controller in a regulated manner. Once a data set has been offloaded to the secondary, the secondary is in a point-in-time consistent state with the primary controller. Only after completion of a given set of transfers does the backup appliance begin to offload the next set of transfers.
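The regulated offload described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of any particular backup appliance. The names `Update`, `form_groups` and `offload`, the fixed group size, the sequence number carried by each update, and the dictionary standing in for the secondary controller are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    seq: int        # order in which the primary issued the write (assumed available)
    location: str   # hypothetical volume/track identifier
    data: bytes


def form_groups(updates, group_size):
    """Sort updates by sequence number and cut them into groups.

    Cutting by a fixed count is an assumption of this sketch; the text
    does not say how a consistency group is delimited.
    """
    ordered = sorted(updates, key=lambda u: u.seq)
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


def offload(groups, secondary):
    """Apply one group at a time to `secondary` (a dict).

    A snapshot is yielded only after a whole group has been written, so
    every yielded state is point-in-time consistent and the next group
    does not start until the previous one has completed.
    """
    for group in groups:
        for update in group:
            secondary[update.location] = update.data
        yield dict(secondary)
```

Each value produced by `offload` corresponds to the secondary as it stands at a group boundary; the states in between, while a group is only partly written, are the ones that can be inconsistent with the primary.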
The formation and regulated transfer of consistent data sets or consistency groups alone is still not sufficient, however, to guarantee that the primary and secondary controllers are always consistent. A failure may occur during the transfer of updates to the secondary controller, resulting in an inconsistent state because some, but not all, of the transfers in the consistent set have been applied. This situation may be remedied through a technique known as “read-before-write”. When read-before-write is implemented, the backup appliance will have received a set of transactions that need to be written to the secondary controller to bring the secondary into consistency with the primary controller. Before writing the new transactions, however, the backup appliance reads the data that is to be updated from the secondary storage and saves the current version of it to a storage disk or other memory to which it has access. Only after the current data is stored at the backup appliance is the updated data written to the secondary controller. Using this method, if a failure occurs while writing the set to the secondary controller, the backup appliance can write the saved version of the data back to the secondary and effectively return the secondary to its previous consistent state.
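The read-before-write sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions: the secondary is modeled as a Python dictionary, `apply_group` and `WriteFailure` are hypothetical names, and the `fail_at` parameter exists only to simulate a failure partway through the write phase.

```python
class WriteFailure(Exception):
    """Hypothetical error raised when a write to the secondary fails."""


_ABSENT = object()  # marks a location that held no data before the update


def apply_group(secondary, group, fail_at=None):
    """Apply `group`, a list of (location, data) pairs, to `secondary`.

    `fail_at` is a fault-injection hook for this example only: the write
    with that index raises WriteFailure. Returns True if the whole group
    was applied, False if it was rolled back.
    """
    # Read phase: save the current contents of every location to be updated.
    saved = {}
    for location, _ in group:
        saved.setdefault(location, secondary.get(location, _ABSENT))

    # Write phase: starts only after all prior contents have been saved.
    try:
        for index, (location, data) in enumerate(group):
            if index == fail_at:
                raise WriteFailure(f"write {index} failed")
            secondary[location] = data
    except WriteFailure:
        # Restore the saved contents so the secondary returns to its
        # previous consistent state.
        for location, old in saved.items():
            if old is _ABSENT:
                secondary.pop(location, None)
            else:
                secondary[location] = old
        return False
    return True
```

For instance, applying three updates with `fail_at=2` leaves the dictionary exactly as it was before the call, even though the first two writes had already been made when the simulated failure occurred.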
The read-before-write technique may allow some data to be lost, but the primary and secondary controllers will still be consistent as of the prior point in time. Thus, a complete copy of the entire primary controller's data is not required to put the secondary back into a full and current consistent state with the primary.
Typically, read-before-write data sets persist for only a very short period of time, depending on the amount of storage allocated for them and the size of each set. Thus, a read-before-write data set will not allow the backup appliance to restore the secondary controller to a state that existed prior to the last remaining read-before-write transaction. Also, independent access to previous data storage states for nonrecovery purposes is typically unavailable, as the backup appliance must apply the previous states sequentially to restore the secondary controller to the desired state. Thus, when implementing read-before-write techniques, the backup appliance can only move backward and forward within a group of consistent sets, restoring the secondary controller to a selected saved state and then updating it to any earlier or later saved state that exists within the limited set of standard saved states. It is desirable to maintain multiple point-in-time consistent data sets that persist for a longer period of time than the standard read-before-write sets, and that are not subject to the requirement that each previous read-before-write set be applied in order before the desired set can be successfully applied. Thus, there is a need in the art for an improved method and apparatus for storing and retrieving multiple point-in-time consistent data sets.
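The two limitations of standard read-before-write sets noted above, short persistence and sequential application, can be illustrated with a small sketch. It is an assumption-laden model rather than a description of any actual appliance: `UndoLog`, its `capacity` parameter and its methods are hypothetical, the secondary is a Python dictionary, and every updated location is assumed to exist already.

```python
from collections import deque


class UndoLog:
    """Bounded store of read-before-write sets (hypothetical model)."""

    def __init__(self, capacity):
        # A deque with maxlen discards its oldest entry when a new one is
        # appended to a full deque, modeling the limited storage allocated
        # to read-before-write sets.
        self._sets = deque(maxlen=capacity)

    def write(self, secondary, updates):
        """Save the current contents of the updated locations, then apply."""
        self._sets.append({loc: secondary[loc] for loc in updates})
        secondary.update(updates)

    def rollback(self, secondary, steps):
        """Undo the last `steps` writes, newest first.

        Each saved set must be applied in turn; there is no way to jump
        directly to an older state, and states older than the oldest
        retained set cannot be reached at all.
        """
        if steps > len(self._sets):
            raise ValueError("state is older than the oldest retained set")
        for _ in range(steps):
            secondary.update(self._sets.pop())
```

With a capacity of two, three successive writes leave only the two most recent saved sets in the log, so the state that existed before the first write can no longer be restored.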