It is important to provide a local and/or remote contemporary asynchronous data copying capability for real-time backup protection of data stored in a data processing installation, such as in peripheral data storage. Backing up or copying to a remote data center protects against physical disasters that a local backup copy, even one held in an independent data-storage system, does not cover. This automatic data copying is referred to as remote duplexing or remote data copying. It is important to have this type of automatic service for disaster data backup. Most users want this automatic backup capability to be direct access storage device (DASD) storage based and independent of application program execution.
Such data preservation uses either synchronous or asynchronous copying. Synchronous copying requires that both the primary and secondary copies of data be retentively stored before an indication is given to the writing host processor that the data are retentively stored. Asynchronous copying merely requires that the data be retentively stored in the primary system's data storage before that indication is given; the backup storage proceeds independently of completion of the primary data storage. Usually such backup is completed in a timely manner using contemporary asynchronous remote data copying. Such remote copying is of special significance where it is anticipated that data copied and stored at a remote site would be the repository for any continued interaction with the data should the work and data of a primary site become unavailable. The factors of interest in copying include the protection domain (system and/or environmental failure or device and/or media failure), data loss (no loss/partial loss), the time at which copying occurs relative to the occurrence of other data and processes (point in time/real time), the degree of disruption to applications executing on the computer, and whether the copy is application or storage subsystem based. With regard to the last factor, application based copying involves log files, data files, and program routines, while storage based copying involves an understanding of DASD addresses and data set identifiers. Another factor is minimal interference of such remote data copying with usual day-to-day data processing operations. It is therefore desired to provide an efficient apparatus and method for remote data copying meeting the above-stated needs.
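The difference between the two copying modes lies in when the writing host is told that its data are retentively stored. The following is a minimal illustrative sketch only, not a description of any particular storage subsystem; the class name `DuplexedStore`, its methods, and the use of in-memory dictionaries to stand in for primary and secondary storage are all assumptions made for illustration.

```python
from collections import deque


class DuplexedStore:
    """Hypothetical model of a primary store with a secondary (backup) copy."""

    def __init__(self):
        self.primary = {}       # stands in for primary retentive storage
        self.secondary = {}     # stands in for secondary retentive storage
        self.pending = deque()  # updates not yet copied to the secondary

    def write_sync(self, address, data):
        # Synchronous copying: both copies are stored before the host is told.
        self.primary[address] = data
        self.secondary[address] = data
        return "stored"

    def write_async(self, address, data):
        # Asynchronous copying: only the primary copy is stored before the
        # host is told; the secondary copy is made later, independently.
        self.primary[address] = data
        self.pending.append((address, data))
        return "stored"

    def drain(self):
        # Background copy step that brings the secondary up to date.
        while self.pending:
            address, data = self.pending.popleft()
            self.secondary[address] = data
```

In this sketch, a record written with `write_async` is absent from `secondary` until `drain` runs, which is the window in which a primary-site failure could lose the update.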
Copying data from a primary system to a remote secondary system asynchronously and independently from primary site data processing operations involves an appreciation of how write update operations are generated at a primary system. In this regard, a primary site includes one or more applications concurrently executing on a processor in which each application generates what is termed "application dependent writes". That is, the storage subsystem has no knowledge or awareness of the write operations or their queued scheduling to be invoked or called via the operating system. Varying delay is endemic. Applications do not write merely to a single or the same DASD. Indeed, they may cluster their writes in differing patterns such that both the queue length and the queue service rate vary among the DASD storage devices for both initial write and copy generating purposes. This means that the copies received at the remote site have a reasonably high probability of being in an out-of-order sequence much of the time and subject to delay or loss.
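The effect of differing queue lengths and service rates on arrival order can be illustrated with a small deterministic model. This is a sketch under simplifying assumptions: each DASD services its own queue serially with a fixed service time, transmission delay is ignored, and the function name `arrival_order` and the device labels are hypothetical.

```python
def arrival_order(writes, service_time):
    """Return primary write sequence numbers in the order their copies complete.

    writes: list of (issue_time, device) in primary issue order.
    service_time: dict mapping device -> time units needed per write.
    """
    free_at = {}      # time at which each device next becomes idle
    completions = []  # (completion_time, sequence_number)
    for seq, (issue_time, device) in enumerate(writes, start=1):
        start = max(issue_time, free_at.get(device, 0))
        done = start + service_time[device]
        free_at[device] = done
        completions.append((done, seq))
    return [seq for _, seq in sorted(completions)]


# Four writes issued in sequence, alternating between a slow and a fast device.
writes = [(0, "A"), (1, "B"), (2, "A"), (3, "B")]
order = arrival_order(writes, {"A": 5, "B": 1})
print(order)  # [2, 4, 1, 3]
```

Although the writes were issued in the order 1, 2, 3, 4, the copies complete in the order 2, 4, 1, 3, because the slower device holds its writes in queue longer.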
Asynchronously and independently executing applications and processors create a stream of write operations against local storage and remote sites. This stream is both queued and executed at different rates, resulting in a near-random ordering of the copy sequence. Since such data streams may contain large quantities of data, it is desired to minimize negative impacts on operation of primary and secondary data processing systems/sites by quickly retentively storing data (removing the data from the memory of host processors) while keeping the primary updating sequence intact at a secondary system/site. Updating directories and real data-storage devices directly from asynchronous data transmissions is not desired because the sequence of primary system/site updating is not preserved. Therefore, it is desired to maintain addressability of all data while deferring directory updating, to ensure that the updating sequence is maintained at a remote or secondary system/site. It is also desired to minimize the number of DASD accesses needed to effect remote dual copying of a large plurality of data records.
At the secondary system the dual or secondary copy is retentively stored as soon as possible to reduce any risk of loss. This secondary retentive storage is independent of the sequence of updating in the primary system. The primary system determines when the secondary system is to arrange the data to preserve the critical updating sequence integrity. The secondary system then sorts the received update indications of data in accordance with the primary system indicated update sequence to find the last valid copy of each data record. The secondary system next reads the valid copy of data from its data-storage system, arranges it in proper address sequence, and stores the data on the secondary DASD for retentive storage. All of the previous updates are then erased (addressability is removed for an effective erase). Such secondary system backup requires up to three DASD accesses. One DASD access is required to initially retentively store all updates of the data. A second DASD access reads the valid data. Any outdated data is not read, but is erased (for example, by removing addressability). The third DASD access stores the valid data in a proper sequence. The first and second DASD accesses could be avoided by retentively storing all update data (even update data that are to be discarded) in a non-volatile cache. Such avoidance may be more expensive to implement. It is desired to reduce such multiple DASD accessing to a single DASD access, without requiring intermediate caching, for retentively writing the updated data in the secondary or backup data-storage system.
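The three-access procedure described above can be sketched as follows. This is an illustrative model only: the `Update` record, the primary-assigned sequence number `seq`, the function name `apply_updates`, and the use of a list and a dictionary in place of real DASD are all assumptions, and each "access" is counted once per phase rather than once per physical I/O.

```python
from typing import NamedTuple


class Update(NamedTuple):
    seq: int      # update sequence indicated by the primary system (assumed)
    address: int  # DASD address of the data record
    data: bytes


def apply_updates(received, secondary_dasd):
    """Apply out-of-order updates to secondary_dasd; return the access count."""
    accesses = 0

    # Access 1: retentively store every update in arrival order.
    staging = list(received)
    accesses += 1

    # Sort the update indications by primary sequence; for each address the
    # last one seen is the valid copy, and earlier ones are outdated.
    latest_slot = {}
    for slot, upd in sorted(enumerate(staging), key=lambda pair: pair[1].seq):
        latest_slot[upd.address] = slot

    # Access 2: read back only the valid copies (outdated data is not read).
    valid = [staging[slot] for slot in latest_slot.values()]
    accesses += 1

    # Access 3: store the valid data in address sequence.
    for upd in sorted(valid, key=lambda u: u.address):
        secondary_dasd[upd.address] = upd.data
    accesses += 1

    # Effective erase of all staged updates by removing addressability.
    staging.clear()
    return accesses


received = [Update(3, 10, b"new"), Update(1, 10, b"old"), Update(2, 20, b"x")]
dasd = {}
count = apply_updates(received, dasd)
print(dasd, count)  # {10: b'new', 20: b'x'} 3
```

The update with sequence 3 arrives first but still wins for address 10, because validity is decided by the primary sequence rather than by arrival order. The count of three phases corresponds to the multiple DASD accesses that the text seeks to reduce to one.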