1. Field of the Invention
This invention relates to systems, methods, and computer program products for offloading volume space reclamation operations to virtual tape systems.
2. Background of the Invention
Various applications on mainframe operating systems store objects, such as datasets or files, on tape volumes. Examples include DFSMS (Data Facility Storage Management Subsystem) HSM (Hierarchical Storage Manager), DFSMS OAM (Object Access Method), and TSM (Tivoli Storage Manager). HSM and TSM are used to migrate objects from disk to tape or to make backup copies of objects. OAM places object data, which may be backup or original data, on tape volumes. These applications typically utilize databases to keep track of object names, the tape volumes the objects are written to, how many tape records the objects contain, and the locations of the objects on the tape volumes. In certain cases, the locations of the objects may be recorded using the logical block ID returned by the tape subsystem.
Over a period of time, objects residing on a tape volume may no longer be needed or may be replaced by newer versions of the objects. The records for these objects may be deleted from an application's database. However, the objects may continue to occupy space on the tape volume. One of the characteristics of a tape volume is that data cannot be modified without overwriting data from the point of the modification to the end of the tape volume. Thus, data cannot be updated in place as it can be on disk-drive-based volumes. This results in tape volumes having certain parts of their capacity occupied by valid data and other parts occupied by invalid data.
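The overwrite characteristic described above can be illustrated with a minimal sketch. The `TapeModel` class and its method names are hypothetical and are introduced only for illustration; the sketch assumes a tape volume can be modeled as an ordered list of records addressed by a simplified block ID.

```python
class TapeModel:
    """Hypothetical model of a tape volume as an ordered list of records."""

    def __init__(self):
        self.records = []

    def append(self, data):
        """Write a record at the end of the tape and return its block ID."""
        self.records.append(data)
        return len(self.records) - 1

    def write_at(self, block_id, data):
        """Write a record at block_id.

        Everything from block_id to the end of the tape is lost, which is
        why an object cannot be updated in place on tape.
        """
        if not 0 <= block_id <= len(self.records):
            raise IndexError("block ID is beyond the end of written data")
        del self.records[block_id:]      # data after the write point is gone
        self.records.append(data)
        return block_id


tape = TapeModel()
for record in ("a", "b", "c"):
    tape.append(record)
tape.write_at(1, "b2")                   # record "c" is lost along with "b"
```

In this model, replacing the middle record also discards the record that followed it, so an application that wants to keep later objects intact can only append new versions and leave the old ones in place as invalid data.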
Objects that are written to a tape volume may vary significantly as to how long they are valid. Some types of data, such as long term archival data, may never become invalid. Other types of data, such as data that is modified frequently, may leave previous invalid versions of the data distributed across one or more tape volumes. Thus, invalid objects may create significant amounts of wasted space on tape volumes.
Applications such as DFSMS HSM and TSM attempt to address this problem by employing a mechanism to recover the wasted space on tape volumes. This mechanism typically includes two elements. First, the application periodically determines whether the amount of valid data on a tape volume it manages has fallen below a specified threshold. This threshold may be a percentage of the amount of data the tape volume can hold when completely full. Second, if the amount of valid data has fallen below the threshold, the application copies the still valid objects from the tape volume to a new tape volume, thereby making all data on the old tape volume invalid and allowing it to be overwritten with new data. The application then updates its database records to reflect that the valid objects reside on the new tape volume and to indicate where the objects are located on the new tape volume. The database is also updated to reflect that the old tape volume has no active data and thus can be reused as “scratch.”
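The two-element mechanism described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used by DFSMS HSM or TSM: the `TapeVolume` class, the `catalog` dictionary standing in for the application's database, and the use of a list index as the block ID are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TapeVolume:
    """Hypothetical tape volume: objects are (name, size) in write order."""
    name: str
    capacity: int                      # amount of data held when full
    objects: list = field(default_factory=list)


def valid_fraction(volume, catalog):
    """Fraction of capacity occupied by objects the catalog still points to.

    catalog maps object name -> (volume name, block ID); an object on the
    volume is valid only if the catalog entry matches its exact location.
    """
    valid = sum(
        size
        for block_id, (name, size) in enumerate(volume.objects)
        if catalog.get(name) == (volume.name, block_id)
    )
    return valid / volume.capacity


def reclaim(source, target, catalog, threshold):
    """Copy valid objects to target if source is below threshold.

    Returns True when the source volume was reclaimed.
    """
    # Element 1: check whether valid data has fallen below the threshold.
    if valid_fraction(source, catalog) >= threshold:
        return False
    # Element 2: copy still valid objects and update the catalog.
    for block_id, (name, size) in enumerate(source.objects):
        if catalog.get(name) != (source.name, block_id):
            continue                   # invalid: no record points here
        new_block_id = len(target.objects)
        target.objects.append((name, size))
        catalog[name] = (target.name, new_block_id)
    source.objects.clear()             # source can be reused as scratch
    return True


# Object "B" was deleted and the first copy of "A" was superseded.
old = TapeVolume("VOL001", 100, [("A", 10), ("B", 40), ("A", 10)])
new = TapeVolume("VOL002", 100)
catalog = {"A": ("VOL001", 2)}
reclaimed = reclaim(old, new, catalog, threshold=0.25)
```

Note that in this sketch every step, including reading the source, writing the target, and updating the catalog, is carried out by the host application.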
The above-stated process is called “Recycle” on DFSMS HSM and “Reclamation” on TSM. Such a process (hereinafter generally referred to as a “reclamation process” or “reclamation operation”) consumes significant resources on the host system. To perform the reclamation process, the host system needs to read the still valid data from a source tape volume and write the valid data out to a new tape volume. This consumes significant I/O bandwidth and CPU cycles. The reclamation process also requires two tape drives, making these drives unavailable for production use. The reclamation process further requires CPU cycles to update the application's database to reflect the new location of the valid objects. For many customers, the overhead of running the reclamation process can be significant, and thus the process should not be run during periods of high host workload.
For a virtual tape system such as IBM's TS7720 or TS7740, the tape volume that an application sees is actually a file structure residing on a file system (referred to as a “virtual tape volume”). Disk drives provide the underlying storage for the file system. The control program within the virtual tape system virtualizes the underlying file structure so that the application sees standard tape records. As a result, an application cannot tell the difference between a real physical tape volume and a virtual tape volume presented by the virtual tape system. All aspects of a physical tape volume, including logical block IDs and positioning, are emulated for a virtual tape volume.
In view of the foregoing, what are needed are systems and methods to offload volume space reclamation operations to virtual tape systems. Ideally, such systems and methods will significantly reduce host overhead and resource utilization associated with the reclamation operations, thereby freeing up resources for production use. Further needed are systems and methods to perform reclamation operations without requiring a host application, such as HSM, to update its database.