1. Field of the Invention
This invention generally relates to the storage of data for use in data processing systems. More particularly, this invention relates to maintaining data integrity and consistency in redundant storage systems.
2. Description of Related Art
Nearly all data processing system users are concerned with maintaining back-up data in order to ensure continued data processing operations should their data become lost, damaged or otherwise unusable. Such back-up operations can be achieved through a variety of procedures. In one approach, copies of data on a primary storage device are made on the same or other media such as magnetic tape to provide an historical backup. Typically, however, these systems require all other operations in the data processing system to terminate while the backup is underway.
More recently disk redundancy has evolved as an alternative or complement to historical tape backups. Generally a redundant system uses two or more disk storage devices to store data in a form that enables the data to be recovered if one disk storage device becomes disabled. For example, a first disk storage device stores the data and a second disk storage device mirrors that data. Whenever a transfer is made to the first disk storage device, the data also transfers to the second disk storage device. Typically separate controllers and paths interconnect the two disk storage devices to the remainder of the computer system. One advantage of this type of system is that the redundant copy is made without interrupting normal operations.
Several systems have been proposed for providing concurrent backups to provide the advantage of a tape backup without interrupting normal operations. For example, U.S. Pat. No. 5,212,784 to Sparks discloses an automated concurrent data backup system in which a central processing unit (CPU) transfers data to and from storage devices through a primary controller. The primary controller connects through first and second independent buses to first and second mirrored storage devices respectively (i.e., a primary, or mirrored, storage device and a secondary, or mirroring, storage device). A backup controller and device connect to one or more secondary storage devices through its bus. Normally the primary controller writes data to the primary and secondary data storage devices. The CPU initiates a backup through the primary controller. In response the backup controller takes control of the second bus and transfers data from one secondary data storage device to the backup media. Applications continue to update the primary and any additional secondary storage devices. After a backup operation is completed, the primary controller resynchronizes the storage devices by updating the secondary storage device that acted as a source for the backup with any changes that occurred to the primary data storage device while the backup operation was underway.
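The detach, backup and resynchronize cycle described above can be illustrated with a minimal sketch. It is not taken from the Sparks patent: the class name `MirroredPair`, its methods, and the use of a set of changed block numbers to drive resynchronization are all assumptions made for illustration.

```python
class MirroredPair:
    """Hypothetical model of a primary device and one mirroring secondary."""

    def __init__(self, block_count):
        self.primary = [None] * block_count
        self.secondary = [None] * block_count
        self.backup_running = False
        self.changed = set()  # assumption: blocks written during the backup

    def write(self, block, data):
        # Applications always update the primary device.
        self.primary[block] = data
        if self.backup_running:
            # The secondary is serving as the backup source, so only
            # remember that this block must be resynchronized later.
            self.changed.add(block)
        else:
            self.secondary[block] = data

    def start_backup(self):
        # The secondary stops receiving writes; its frozen image is
        # what would be copied to the backup media.
        self.backup_running = True
        return list(self.secondary)

    def finish_backup(self):
        # Resynchronize: copy only the blocks that changed on the primary.
        for block in sorted(self.changed):
            self.secondary[block] = self.primary[block]
        self.changed.clear()
        self.backup_running = False


pair = MirroredPair(3)
pair.write(0, "a")
image = pair.start_backup()
pair.write(1, "b")          # arrives while the backup is underway
pair.finish_backup()
```

In this sketch `image` holds the data as it stood when the backup began, and after `finish_backup()` the secondary again matches the primary.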
U.S. Pat. Nos. 5,241,668 and 5,241,670 to Eastridge et al. disclose different aspects of concurrent backup procedures. In accordance with these references a request for a backup copy designates a portion of the stored data called a “dataset”. For example, if the data storage devices contain a plurality of discrete data bases, a dataset could include files associated with one such data base. In a normal operation, the application is suspended to allow the generation of an address concordance for the designated datasets. Execution of the application then resumes. A resource manager manages all input and output functions between the storage sub-systems and associated memory and temporary memory. The backup copy forms on a scheduled and opportunistic basis by copying the designated datasets from the storage sub-systems and updating the address concordance in response to the copying. Application updates are processed during formation of the backup copy by buffering the updates, copying the affected uncopied designated datasets to a storage sub-system memory, updating the address concordance in response to the copying, and processing the updates. The designated datasets can also be copied to the temporary storage memory if the number of designated datasets exceeds some threshold. The designated datasets are also copied to an alternate memory from the storage sub-system, storage sub-system memory and temporary host memory utilizing the resource manager and the altered address concordance to create a specified order backup copy of the designated datasets from the copied portions of the designated datasets without user intervention.
Still referring to the Eastridge et al. patents, if an abnormal event occurs requiring termination of the backup, a status indication is entered into activity tables associated with the plurality of storage sub-systems and devices in response to the initiation of the backup session. If an external condition exists that requires the backup to be interrupted, the backup copy session terminates and indications within the activity tables are reviewed to determine the status of the backup if a reset notification is raised by a storage sub-system. This enables the determination of track extents which are active for a volume associated with a particular session. A comparison is then made between the track extents which are active and the volume and track extent information associated with a physical session identification. If a match exists between the track extents which are active and the volume and track extent information associated with a physical session identification, the backup session resumes. If the match does not exist, the backup terminates.
U.S. Pat. No. 5,473,776 to Nosaki et al. discloses a concurrent backup operation in a computer system having a central processing unit and a multiple memory constituted by a plurality of memory devices for on-line storage of data processed by tasks of the central processing unit. A data backup memory is provided for saving data of the multiple memory. The central processing unit performs parallel processing of user tasks and a maintenance task. The user tasks include those that write currently processed data into the multiple memory. The maintenance task stops any updating of memory devices as a part of the multiple memory and saves the data to a data backup memory.
More recently the concept of redundancy has come to include geographically remote data facilities. As described in U.S. Pat. Nos. 5,544,347 to Yanai et al. for Remote Data Mirroring and 5,742,792 to Yanai et al. for Remote Data Mirroring (both assigned to the assignee of this invention), a computer system includes one or more local and one or more remote data facilities. Each local and remote data facility typically includes a data processing system with disk storage. A communications path, which may comprise one or more individual communications links, interconnects a local storage facility with a remote storage facility that is a mirror for the local storage facility. The physical separation can range from meters to hundreds or even thousands of kilometers. In whatever form, the remote data facility provides data integrity with respect to any system errors produced by power failures, equipment failures and the like.
In prior art systems one dataset normally is stored in a single storage facility, so data consistency has been achieved whenever the remote storage facility exactly mirrors the local storage facility; i.e., the two facilities are in synchronism. Generally, if a communications path comprising one or more communications links fails (i.e., no data can be transferred over any of the communications links), the dataset remains in the remote storage facility, but no longer is updated. This becomes particularly important when data must be recovered because, without consistency or synchronism, data in a dataset that has not yet reached the remote or backup facility may be lost.
U.S. Pat. No. 5,720,029 to Kern et al. discloses one approach for providing a disaster recovery system that utilizes asynchronous remote data shadowing to obtain a backup copy of data. A host processor at the primary, or local, site transfers a sequentially consistent order of copies of record updates to the secondary site for backup purposes. The copied record updates are stored on the secondary storage devices at the remote site that form remote copy pairs with the primary data storage devices. One track array, as an active track array, is used to set elements according to which tracks on the primary storage device receive record updates from the host processor at the primary site. The other track array, as a recovery track array, designates which record updates comprise the copy record updates currently transferred from the primary site to the secondary site for data shadowing and is used for recovery should an error interrupt the transfer. The track arrays are toggled once the consistency group transfer completes, so that the recovery track array becomes the active track array and the active track array becomes the recovery track array.
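The toggling of the two track arrays can be sketched roughly as follows. This is an illustrative reading of the description above, not the Kern et al. implementation: the class `TrackArrays`, its method names, and the choice to clear the recovery array before the swap are assumptions.

```python
class TrackArrays:
    """Hypothetical pair of per-track flag arrays that swap roles."""

    def __init__(self, track_count):
        self.active = [False] * track_count    # tracks updated by the host
        self.recovery = [False] * track_count  # tracks in the group in transit

    def record_update(self, track):
        # A host write marks the corresponding element of the active array.
        self.active[track] = True

    def toggle(self):
        # Assumption: the group described by the recovery array has been
        # transferred in full, so its marks are cleared before the swap.
        self.recovery = [False] * len(self.recovery)
        self.active, self.recovery = self.recovery, self.active

    def tracks_to_recover(self):
        # Tracks that would have to be resent if the transfer were interrupted.
        return [t for t, marked in enumerate(self.recovery) if marked]


arrays = TrackArrays(8)
arrays.record_update(2)
arrays.record_update(5)
arrays.toggle()              # tracks 2 and 5 now form the group in transit
arrays.record_update(7)      # a new update lands in the fresh active array
in_transit = arrays.tracks_to_recover()
newly_active = [t for t, marked in enumerate(arrays.active) if marked]
```

Here `in_transit` identifies the tracks of the group being shadowed, while `newly_active` collects updates that arrive during that transfer.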
U.S. Pat. No. 5,649,152 to Ohran et al. discloses another method and system for providing a static snapshot of data stored on a mass storage system. In accordance with this approach a preservation memory is provided and a virtual device is created in that preservation memory. Whenever a write operation is to be performed on the mass storage system, a check is made of the preservation memory to determine if it contains a block associated with the mass storage write address. If no block is present, a copy of the block in the mass storage system at the block write address is placed in the preservation memory. Whenever a read is to be performed on the virtual device, a check is made of the preservation memory to determine if it contains a block associated with the virtual device read address. If a block exists, that block is returned in response to the read operation. Otherwise, a block at the virtual device block read address is returned from the mass storage device.
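The write and read rules just described amount to a copy-on-write snapshot. The following minimal sketch shows that logic under simplifying assumptions: the mass storage system is modeled as a Python list of blocks, the preservation memory as a dictionary, and the names `SnapshotStore`, `write` and `read_virtual` are hypothetical.

```python
class SnapshotStore:
    """Hypothetical copy-on-write snapshot over a list of blocks."""

    def __init__(self, mass_storage):
        self.mass_storage = mass_storage  # live blocks, changed by writes
        self.preservation = {}            # address -> block as of the snapshot

    def write(self, address, data):
        # Preserve the original block only the first time the address
        # is overwritten; later writes leave the preserved copy alone.
        if address not in self.preservation:
            self.preservation[address] = self.mass_storage[address]
        self.mass_storage[address] = data

    def read_virtual(self, address):
        # The virtual device returns the preserved block if one exists,
        # otherwise the unmodified block from mass storage.
        if address in self.preservation:
            return self.preservation[address]
        return self.mass_storage[address]


disk = ["a0", "b0", "c0"]
snap = SnapshotStore(disk)
snap.write(1, "b1")
snap.write(1, "b2")   # second write to the same address
snapshot_view = [snap.read_virtual(i) for i in range(3)]
```

The virtual device continues to present the data as it stood when the snapshot was taken, while `disk` reflects the latest writes.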
U.S. Pat. No. 5,680,580 to Beardsley et al. discloses a remote copy system that incorporates dynamically modifiable ports on storage controllers such that those ports can operate either as a control unit link-level facility or as a channel link-level facility. When configured as a channel link-level facility, a primary storage controller can appear as a host processor to a secondary storage controller. The primary storage controller can thereafter initiate multiple request connects concurrently for servicing a single I/O request. In this manner, a first available path can be selected and system throughput is improved. In this system host write commands at the primary storage controller are intercepted for a remote dual copy process. As a result of the intercept, the system determines whether a unit check write I/O flag is set. If it is not set, data is written to the primary cache or NVS and thereafter to the primary device. Once the data is stored at the primary storage controller, a connection is established to the secondary storage controller to allow a remote copy to proceed to transmit the data to the secondary storage controller.
Each of the foregoing references describes a different method of obtaining a backup and particularly addresses data consistency as between a specific storage controller and its backup facility whether that facility comprises a magnetic disk or tape device. The broad or basic object of these patents, particularly the Ohran et al. and Kern et al. patents, is to provide a method of tracking any changes that are in transit so that a disaster recovery will identify those items that need to be recovered.
Now storage facilities using redundancy including remote data facilities have become repositories for large databases. Recently, these databases and other types of datasets have grown to such a size that they are distributed across multiple independent storage controllers or facilities. This has led to a new definition of data consistency. In the following description we use “synchronism” in a conventional context and “consistency” in a modified context to account for such distributed datasets. As between a single storage controller and a single backup facility, such as disclosed in the foregoing Yanai et al. patents, the storage devices are in synchronism when the data at the local site corresponds exactly to the data on a secondary storage facility coupled by a single communications path. When multiple independent communications paths are involved with the transfer of data in different portions of a dataset, such as the journal log file and the data base, and the transfer of data over one path is interrupted, the remote storage facility associated with that communications path loses synchronism. In addition, even though other remote sites may remain in synchronism, the data across the remote storage facilities storing the dataset will no longer be consistent. If this occurs, the remotely stored dataset becomes corrupted. Conversely, if data transfers can occur over all the communications paths associated with a dataset and all the corresponding remote storage facilities are in synchronism with their local storage facility counterparts, the dataset is consistent. Consequently, what is needed is a method and apparatus for enabling a user to be assured that the data at the remote data facilities in such multiple communications path configurations is consistent, even when data cannot be transferred across one or more communications paths.
Therefore it is an object of this invention to provide a method and apparatus for assuring consistency of data at one or more remote sites coupled to one or more local sites by multiple communications paths.
Another object of this invention is to provide such data consistency at a remote site transparently to any user application.
Still another object of this invention is to provide such data consistency to a remote site with minimal impact on other data processing operations.
In accordance with this invention, a host interacts with a first dataset copy. Transfers to a second dataset copy occur over multiple independent communications paths. If a transfer over one of the independent communications paths is not efficacious, all transfers from the first to the second dataset copy over all the independent paths are terminated. However, operations between the host and the first dataset copy continue. When the cause of the transfer interruption is corrected, transfers to the second dataset copy over all the independent communications paths resume.
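The behavior summarized above can be illustrated with a minimal sketch. It is a simplified model, not the claimed apparatus: the class `ConsistencyGroup`, its method names, the path names `"log"` and `"db"`, and the replay of held-back writes on resumption are all assumptions introduced for illustration.

```python
class ConsistencyGroup:
    """Hypothetical model of one dataset mirrored over several paths."""

    def __init__(self, path_names):
        self.path_up = {name: True for name in path_names}
        self.suspended = False
        self.primary = []                              # first dataset copy
        self.secondary = {name: [] for name in path_names}  # second copy
        self.pending = []                              # writes held back

    def host_write(self, path, data):
        # Operations between the host and the first copy always continue.
        self.primary.append((path, data))
        if self.suspended:
            self.pending.append((path, data))
        else:
            self.secondary[path].append(data)

    def path_failed(self, path):
        # A failure on any one path suspends transfers on every path,
        # so the second copy stays consistent as of the suspension.
        self.path_up[path] = False
        self.suspended = True

    def path_restored(self, path):
        self.path_up[path] = True
        if all(self.path_up.values()):
            # Assumption: held-back writes are replayed in order on resume.
            self.suspended = False
            for pending_path, data in self.pending:
                self.secondary[pending_path].append(data)
            self.pending.clear()


group = ConsistencyGroup(["log", "db"])
group.host_write("log", "L1")
group.host_write("db", "D1")
group.path_failed("db")
group.host_write("log", "L2")   # held back although the log path is healthy
during_outage = {name: list(blocks) for name, blocks in group.secondary.items()}
group.path_restored("db")
```

During the outage the second copy holds only the writes made before the failure on both paths, which is the point of suspending all paths together; once the failed path is restored, the held-back write reaches the second copy.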