The disclosure relates generally to automated data storage systems and more particularly, to a method and a computer program product for determining and performing an efficient elimination of access to data stored on tape media of a tape cartridge.
A virtual tape system is a tape management system such as a special storage device or group of devices and software which manages data such that the data appears to be stored entirely on tape cartridges when portions of the data may actually be located in faster, hard disk storage. Programming for a virtual tape system is sometimes referred to as a virtual tape server (VTS), although these terms may be used interchangeably, unless otherwise specifically indicated. A virtual tape system may be used with a hierarchical storage management (HSM) system in which data is moved as the data falls through various usage thresholds to slower but less costly forms of storage media. A virtual tape system may also be used as part of a storage area network (SAN) where less-frequently used or archived data can be managed by a single virtual tape server for a number of networked computers.
In prior art virtual tape storage systems, such as International Business Machines (IBM) Magstar Virtual Tape Server, at least one virtual tape server (VTS) is coupled to a tape library comprising numerous tape drives and tape cartridges. The VTS is also coupled to a direct access storage device (DASD), comprised of numerous interconnected hard disk drives.
The DASD functions as a tape volume cache (TVC) of the VTS subsystem. When using a VTS, the host application writes tape data to virtual drives. The volumes written by the host system are physically stored in the tape volume cache (e.g., a RAID disk buffer) and are called virtual volumes. The storage management software within the VTS copies the virtual volumes in the TVC to the physical cartridges owned by the VTS subsystem. Once a virtual volume is copied or migrated from the TVC to tape, the virtual volume is then called a logical volume. As virtual volumes are copied from the TVC to a Magstar cartridge (tape), they are written on the cartridge end to end, taking up only the space written by the host application. This arrangement maximizes utilization of a cartridge's storage capacity.
The storage management software manages the location of the logical volumes on the physical cartridges, and the customer has no control over the location of the data. When a logical volume is copied from a physical cartridge to the TVC, the process is called recall and the volume becomes a virtual volume again. The host cannot distinguish between physical and virtual volumes, or physical and virtual drives. Thus, the host treats the virtual volumes and virtual drives as actual cartridges and drives and all host interaction with tape data in a VTS subsystem is through virtual volumes and virtual tape drives.
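The volume life cycle described above can be illustrated with a minimal sketch. The class and method names (`VirtualTapeServer`, `host_write`, `migrate`, `recall`, `host_read`, `state`) are hypothetical and chosen only for illustration; the sketch assumes a single cache and a single tape store keyed by volume serial number, and does not represent any actual VTS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTapeServer:
    """Toy model: a tape volume cache (TVC) in front of physical tape."""
    cache: dict = field(default_factory=dict)  # TVC: volser -> data
    tape: dict = field(default_factory=dict)   # physical tape: volser -> data

    def host_write(self, volser, data):
        # The host only ever writes to a virtual drive, i.e. the cache.
        self.cache[volser] = data

    def migrate(self, volser):
        # Move the volume from the TVC to tape; it is now a logical volume.
        self.tape[volser] = self.cache.pop(volser)

    def recall(self, volser):
        # Copy the logical volume back into the TVC; it is virtual again.
        self.cache[volser] = self.tape[volser]

    def host_read(self, volser):
        # The host cannot tell where the data lives; recall happens silently.
        if volser not in self.cache:
            self.recall(volser)
        return self.cache[volser]

    def state(self, volser):
        # "virtual" while resident in the TVC, "logical" when only on tape.
        return "virtual" if volser in self.cache else "logical"


vts = VirtualTapeServer()
vts.host_write("VOL001", b"payroll")
vts.migrate("VOL001")
print(vts.state("VOL001"))      # logical
print(vts.host_read("VOL001"))  # b'payroll'
print(vts.state("VOL001"))      # virtual
```

Note that in this sketch the copy on tape remains in place after a recall, which is consistent with the description that recall copies, rather than moves, the logical volume.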
One issue with VTS systems is the management of data within the tapes. The VTS system may have a number of duplicate, invalid, latent or unused copies of data. After a virtual tape volume is created and/or modified (one or more records are written to the volume) and closed, the virtual tape volume is copied onto the physical tape (logical) volume. The image of the virtual volume copied to a physical volume when the virtual volume was closed is a complete version of the virtual volume at the point in time the virtual volume was closed. If a virtual volume is subsequently opened and modified, then when the virtual volume is closed, the new image of the virtual volume is also copied onto physical tape. However, the new image does not overwrite the prior version of the volume, since the new image may have a different size than the previous version. Thus, at any point in time, there may be several versions of the same volume serial number that reside on one or more physical tape volumes.
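The accumulation of versions described above can be sketched as an append-only log. The names `TapeRecord`, `PhysicalTape`, `append_image`, and `versions`, as well as the `valid` flag, are assumptions introduced for illustration; for simplicity the sketch keeps all versions on a single physical tape, whereas the versions may in practice reside on more than one physical tape volume.

```python
from dataclasses import dataclass


@dataclass
class TapeRecord:
    volser: str         # volume serial number seen by the host
    version: int        # 1 for the first image, 2 for the next, ...
    size_mb: int        # each image may differ in size
    valid: bool = True  # only the most recent image is current


class PhysicalTape:
    """Append-only tape: a new image never overwrites an earlier one."""

    def __init__(self):
        self.records = []

    def append_image(self, volser, size_mb):
        prior = [r for r in self.records if r.volser == volser]
        for r in prior:
            r.valid = False  # superseded, but still physically on tape
        rec = TapeRecord(volser, len(prior) + 1, size_mb)
        self.records.append(rec)
        return rec

    def versions(self, volser):
        return [r for r in self.records if r.volser == volser]


tape = PhysicalTape()
tape.append_image("VOL001", 100)  # first close of the virtual volume
tape.append_image("VOL001", 250)  # reopened, modified, closed again
for r in tape.versions("VOL001"):
    print(r.version, r.size_mb, r.valid)
```

Both images remain on the tape after the second close; only the bookkeeping marks the earlier one as superseded.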
Moreover, physical volumes within a VTS are arranged in groups that are called “pools,” with each physical volume including one or more logical volumes. Each of the physical volumes managed by the VTS system is assigned to one of 32 pools, for example. It is understood that each pool of physical volumes is assigned a name and may have one or more parameters associated therewith. For example, typical parameters associated with a pool include, but are not limited to: a media type (e.g., physical volumes having 10 Gigabyte tape or 20 Gigabyte tape); and one or more rules for managing volumes in the pool. One rule may involve the concept of “reclamation” whereby the VTS monitors what percentage of the data on a particular physical volume is still valid. That is, over time, data space occupied by a logical volume needs to be reclaimed from a physical volume when the data is no longer used or needed by the host, e.g., the physical volume has expired. Thus, if the percentage of valid data on any physical volume in the pool falls below a reclaim percent threshold, then a reclamation process will be performed to take the valid logical volume(s) off that physical volume and put the valid logical volume(s) on another physical volume, potentially combining multiple partially full physical volumes and filling up the other.
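The reclamation rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the names `LogicalVolume`, `PhysicalVolume`, `valid_percent`, and `reclaim` are hypothetical, and the valid percentage is computed here relative to the data written on the physical volume rather than to its total capacity, which the description does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalVolume:
    volser: str
    size_mb: int
    valid: bool = True


@dataclass
class PhysicalVolume:
    name: str
    contents: list = field(default_factory=list)

    def valid_percent(self):
        # Assumption: percentage of written data that is still valid.
        total = sum(lv.size_mb for lv in self.contents)
        if total == 0:
            return 0.0
        valid = sum(lv.size_mb for lv in self.contents if lv.valid)
        return 100.0 * valid / total


def reclaim(pool, threshold_percent, target):
    """Move valid logical volumes off under-threshold physical volumes."""
    reclaimed = []
    for pv in pool:
        if pv is target or not pv.contents:
            continue  # skip the destination and empty cartridges
        if pv.valid_percent() < threshold_percent:
            target.contents.extend(
                LogicalVolume(lv.volser, lv.size_mb)
                for lv in pv.contents if lv.valid
            )
            # The old copies are only marked invalid; nothing is overwritten.
            for lv in pv.contents:
                lv.valid = False
            reclaimed.append(pv.name)
    return reclaimed


a = PhysicalVolume("A", [LogicalVolume("V1", 100, False), LogicalVolume("V2", 20)])
b = PhysicalVolume("B", [LogicalVolume("V3", 100), LogicalVolume("V4", 10, False)])
spare = PhysicalVolume("SPARE")
print(reclaim([a, b, spare], 25.0, spare))    # ['A']
print([lv.volser for lv in spare.contents])   # ['V2']
```

In this sketch the source physical volume keeps its old records after reclamation, so the reclaimed data is still physically present on that cartridge until it is overwritten or erased.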
If a virtual volume is removed from the physical volume and put onto another physical volume, the data on the first physical volume is deleted but has not been overwritten, and thus, the data may still be accessed and recovered. Further, data associated with the most current version of a virtual volume may be expired or considered latent or unusable by the customer, but the virtual volume will still exist on the physical tape volume and could be accessed.
Recently, enterprises have become more dependent on the ability to store, organize, manage and distribute data. Accordingly, “information life-cycle management,” the process of managing business data from conception until disposal in a manner that optimizes storage, access, and cost characteristics, has become increasingly important. In particular, the significance of how data is “deleted” or disposed of has increased as confidential data has begun to play a more vital role in business transactions and stricter regulations are imposed on maintaining customer privacy.
To protect confidential or sensitive data (e.g., credit card information, social security numbers) and to maintain customer privacy, it is advantageous to eliminate access to certain data by performing a long erase so that the data is unrecoverable and inaccessible. Eliminating access to data is defined herein as rendering data permanently unreadable by any reasonable means. The method of performing a long erase from the beginning of tape to the end of tape has worked sufficiently for all media types regardless of media capacity and drive capabilities. However, as newer media types and drive capabilities have been developed, this method has become less efficient and less effective. The present disclosure provides a method, system and computer program product for determining and performing an efficient process of eliminating access to data on a tape cartridge.