1. Field of the Invention
The present invention relates to a system, method, and program for copying data from one virtual tape server to another virtual tape server in a peer-to-peer environment.
2. Description of the Related Art
In prior art virtual tape storage systems, hard disk drive storage is used to emulate tape drives and tape cartridges. In this way, host systems performing input/output (I/O) operations with respect to tape are in fact performing I/O operations with respect to a set of hard disk drives emulating the tape storage. In the prior art International Business Machines (IBM) Magstar Virtual Tape Server, one or more virtual tape servers ("VTS") are each integrated with a tape library comprising numerous tape cartridges and tape drives, and have a direct access storage device (DASD) comprised of numerous interconnected hard disk drives. The operation of the virtual tape server is transparent to the host. The host makes a request to access a tape volume. The virtual tape server intercepts the tape requests and accesses the volume in the DASD. If the volume is not in the DASD, then the virtual tape server recalls the volume from the tape drive to the DASD. The virtual tape server can respond to host requests for volumes in tape cartridges from DASD substantially faster than responding to requests for data from a tape drive. Thus, the DASD functions as a tape volume cache for volumes in the tape cartridge library.
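The tape volume cache behavior described above can be sketched roughly as follows. This is a toy illustration only, not the Magstar implementation: the class name `TapeVolumeCache`, its attributes, and the use of dictionaries to stand in for the DASD and the tape library are all assumptions made for the example.

```python
class TapeVolumeCache:
    """Toy model of a DASD cache in front of a tape library (all names hypothetical)."""

    def __init__(self, tape_library):
        self.tape_library = dict(tape_library)  # volume id -> data held on physical tape
        self.dasd = {}                          # volume id -> data cached on disk
        self.recalls = 0                        # count of slow tape recalls performed

    def read(self, volume_id):
        # Cache hit: the volume is already in the DASD, so serve it from disk.
        if volume_id in self.dasd:
            return self.dasd[volume_id]
        # Cache miss: recall the volume from tape into the DASD, then serve it.
        data = self.tape_library[volume_id]
        self.dasd[volume_id] = data
        self.recalls += 1
        return data
```

In this sketch, a second read of the same volume is served from the DASD and does not increment the recall count, which mirrors the transparency to the host described above.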
Two virtual tape servers can be combined to create a peer-to-peer virtual tape server. In a peer-to-peer virtual tape server, two virtual tape servers, each integrated with a separate tape library, can provide access and storage for the same data volumes (i.e. peer-to-peer environment). By providing two virtual tape server subsystems and two libraries, if an operation to recall a file from one virtual tape server subsystem and tape library fails, then the file may still be recalled from the other virtual tape server subsystem and tape library. This redundant architecture provides greater data and tape availability and improved data shadowing in the event a tape or VTS in one subsystem is damaged. Therefore, when a host system writes to the storage device, the data will be saved on both virtual tape servers. However, rather than writing to both virtual tape servers simultaneously, which would be a drain on system resources, a virtual tape controller connecting the two virtual tape servers will write the logical volume to one of the virtual tape servers when the host closes the logical volume. An example of a virtual tape controller is the IBM AX0 Virtual Tape Controller ("AX0 VTC") which acts as an intelligent switch between the two virtual tape servers and transparently connects the host computers with the virtual tape servers. Then, the logical volume is copied by the virtual tape controller from one virtual tape server to the other virtual tape server.
The synchronization process between the virtual tape servers can occur immediately or be deferred based on user preferences. Often a host user will set the backup process to occur at a later time, particularly at companies that operate on a cyclical basis. For example, a Wall Street firm may desire higher peak host input/output performance during trading hours (and not have the backup process slow down their computers), and choose to defer the backup process between the two virtual tape servers until the trading day has ended. In addition, the IBM Peer-to-Peer Virtual Tape Server would operate in deferred mode if one of the VTS subsystems fails.
In operating a virtual tape server, especially one that has a lot of host write activity, space in the VTS DASD cache needs to be continually made available for newly written volumes. However, when operating in deferred mode, if too much data is stored in the DASD before the copy operation is performed, uncopied data may be deleted before being copied to the other virtual tape server, because the oldest data is erased first regardless of whether the data has been copied. In such cases, the only copy of the data will exist on a physical tape in the tape library; however, backing up the other virtual tape server from a tape drive causes large delays in the backup process. The penalty for a tape drive recall is slightly over a factor of ten in copy throughput. This factor of ten penalty is so severe on the IBM Peer-to-Peer Virtual Tape Server that, if all the logical volumes were on tape, the copy process could never "catch up" to the host. Thus, there is a need in the art for improved mechanisms for backing up data from one virtual tape server to another in the deferred mode.
Provided is a method, system, and article of manufacture for maintaining data in two storage devices, wherein the data is comprised of a plurality of data sets. A flag is maintained for each data set indicating whether the data set has been copied to the other storage device. In addition, a timestamp is maintained for each data set. Each time a data set is modified or newly created, the data set is flagged as an uncopied data set using the flag associated with the data set. The preferred embodiments modify the timestamp for each uncopied data set by adding a period of time, and thus give preference to retaining uncopied data sets in the storage device: when space is needed, the data set with the oldest timestamp is deleted first, so an uncopied data set whose timestamp has been advanced is deleted later than a copied data set of the same age.
In further embodiments, once the uncopied data set is copied from one storage device to the other storage device, the flag of the newly copied data set is changed to indicate that the data set has been copied. The timestamp for the newly copied data set is then set back to normal by subtracting the same period of time added on when the data set was flagged as needing to be copied.
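The flag and timestamp handling summarized above might look roughly like the following sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `DataSet`, `mark_uncopied`, `mark_copied`, and `evict_oldest` are hypothetical, the 48-hour value of `RETENTION_BIAS` is invented (the text specifies only "a period of time"), and the guards that prevent the period from being added or subtracted twice are an added assumption.

```python
from dataclasses import dataclass

# Assumed value in seconds; the text says only "a period of time".
RETENTION_BIAS = 48 * 3600


@dataclass
class DataSet:
    name: str
    timestamp: float     # compared during oldest-first deletion
    copied: bool = True  # flag: has the other storage device received this data set?


def mark_uncopied(ds):
    # Flag a modified or newly created data set and advance its timestamp.
    if ds.copied:  # assumption: never add the period twice
        ds.copied = False
        ds.timestamp += RETENTION_BIAS


def mark_copied(ds):
    # After the copy completes, flip the flag and subtract the same period.
    if not ds.copied:
        ds.copied = True
        ds.timestamp -= RETENTION_BIAS


def evict_oldest(cache):
    # When space is needed, delete the data set with the oldest timestamp.
    victim = min(cache, key=lambda ds: ds.timestamp)
    cache.remove(victim)
    return victim
```

Because the uncopied data set is only biased and never pinned, it still becomes eligible for deletion once everything else in the cache is newer than its advanced timestamp, so the cache can always free space.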
In still further embodiments, the step of initializing the copy operation from one storage device to the other storage device comprises placing the flag of each data set into a flatfile, reviewing the flag of each data set from the flatfile and searching the flatfile to locate an uncopied data set.
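As a rough sketch of the flat file step, the flags could be written one per line and then scanned for the first uncopied entry. The file layout used here (a data set name followed by `1` for copied or `0` for uncopied) and the function names `write_flags` and `find_uncopied` are assumptions made for illustration; the text does not specify a format. An in-memory `io.StringIO` buffer stands in for the file.

```python
import io


def write_flags(flatfile, flags):
    # Place the flag of each data set into the flat file, one "name flag" line each.
    for name, copied in flags.items():
        flatfile.write(f"{name} {1 if copied else 0}\n")


def find_uncopied(flatfile):
    # Review each flag in turn and return the first uncopied data set, if any.
    for line in flatfile:
        name, flag = line.split()
        if flag == "0":
            return name
    return None


# Usage: V2 is the first data set in the file whose flag marks it as uncopied.
buf = io.StringIO()
write_flags(buf, {"V1": True, "V2": False, "V3": False})
buf.seek(0)
first = find_uncopied(buf)
```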
An advantage of the preferred embodiments is that uncopied data sets will be given preference to be retained in the cache over copied data sets without forcing any conditions that would in themselves cause a storage device to fail because a data set could not be copied from the storage device.