The present invention relates to data storage systems, and more particularly, to virtual tape systems that use physical tape caching during deduplication operations.
A virtual tape system (VTS) is a tape management system, such as a special storage device or group of devices and software, that manages data such that the data appears to be stored entirely on tape cartridges when portions of the data may actually be located in faster, more highly available types of storage media, such as hard disk drives (HDDs), Flash memory, etc. The programming for a VTS is sometimes referred to as a virtual tape server; the two terms may be used interchangeably unless otherwise specifically indicated. A VTS may be used with a hierarchical storage management (HSM) system in which data is moved from one storage tier to another, toward slower but less costly forms of storage media, as the data falls through various usage thresholds. A VTS may also be used as part of a storage area network (SAN) where less frequently used or archived data may be managed by a single virtual tape server for any number of networked computers.
In prior art VTSs, at least one virtual tape server is coupled to a tape library comprising numerous tape drives and tape cartridges. The virtual tape server is also coupled to one or more direct access storage devices (DASDs), each of which may comprise numerous interconnected HDDs, Flash memory, or any combination thereof.
The DASD functions as a tape volume cache (TVC) of the VTS subsystem. When using a VTS, the host application writes tape data to virtual drives. The volumes written by the host system are physically stored in the TVC (e.g., a RAID disk buffer) and are called virtual volumes. The storage management software within the VTS copies the virtual volumes in the TVC to the physical cartridges owned by the VTS subsystem. Once a virtual volume is copied or migrated from the TVC to tape, the virtual volume is called a logical volume. As virtual volumes are copied from the TVC to a tape cartridge (tape), they are written to the tape end to end, each taking up only the space actually written by the host application. This arrangement maximizes utilization of tape cartridge storage capacity.
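The end-to-end placement described above may be illustrated with a minimal sketch. The names used here (`TapeCartridge`, `append`, `migrate`, and the volume serials) are hypothetical and chosen only for illustration; they do not describe any particular VTS implementation. The sketch assumes each virtual volume occupies exactly the number of bytes written by the host and is placed immediately after the previous volume on the cartridge.

```python
from dataclasses import dataclass, field


@dataclass
class TapeCartridge:
    """Hypothetical model of one physical cartridge (illustration only)."""
    capacity: int                                 # total bytes on the cartridge
    used: int = 0                                 # bytes written so far
    extents: dict = field(default_factory=dict)   # volser -> (offset, length)

    def append(self, volser: str, length: int) -> int:
        """Place a volume directly after the previous one and return its offset."""
        if self.used + length > self.capacity:
            raise ValueError("cartridge full")
        offset = self.used
        self.extents[volser] = (offset, length)
        self.used += length
        return offset


def migrate(tvc: dict, tape: TapeCartridge) -> None:
    """Copy every virtual volume in the TVC to tape, end to end.

    Only the bytes actually written by the host are counted, so no
    space is left unused between consecutive logical volumes.
    """
    for volser, data in tvc.items():
        tape.append(volser, len(data))


# Three virtual volumes of different sizes held in the (assumed) TVC.
tvc = {"VOL001": b"a" * 100, "VOL002": b"b" * 250, "VOL003": b"c" * 50}
tape = TapeCartridge(capacity=1000)
migrate(tvc, tape)
```

In this sketch the three volumes occupy contiguous extents, so the space consumed on the cartridge equals the sum of the volume sizes rather than a multiple of some fixed per-volume allocation.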
The storage management software manages the location of the logical volumes on the physical cartridges, and a user typically has no control over the location of the data. When a logical volume is copied from a physical tape cartridge to the TVC, the process is called recall, and the volume becomes a virtual volume again. The host cannot distinguish between physical and virtual volumes, or between physical and virtual drives. Thus, the host treats the virtual volumes and virtual drives as actual tape cartridges and drives, and all host interaction with tape data in a VTS subsystem is through virtual volumes and virtual tape drives.
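The recall behavior and its transparency to the host may be sketched as follows. The class and method names (`VirtualTapeServer`, `write`, `migrate`, `read`) are hypothetical, and the sketch makes one simplifying assumption that the text above does not require: migration removes the volume from the TVC, so that a later read forces a recall.

```python
class VirtualTapeServer:
    """Hypothetical sketch of host-transparent recall (illustration only)."""

    def __init__(self):
        self.tvc = {}      # volser -> data held in the tape volume cache
        self.tape = {}     # volser -> data held on physical cartridges
        self.recalls = 0   # number of recalls performed so far

    def write(self, volser, data):
        """The host writes to a virtual drive; the data lands in the TVC."""
        self.tvc[volser] = data

    def migrate(self, volser):
        """Move a virtual volume to tape (assumption: the cached copy is dropped)."""
        self.tape[volser] = self.tvc.pop(volser)

    def read(self, volser):
        """The host reads a volume; a recall happens only if it is not cached."""
        if volser not in self.tvc:
            self.tvc[volser] = self.tape[volser]   # recall from tape into the TVC
            self.recalls += 1
        return self.tvc[volser]


vts = VirtualTapeServer()
vts.write("VOL001", b"payload")
vts.migrate("VOL001")
first = vts.read("VOL001")    # not in the TVC, so a recall occurs
second = vts.read("VOL001")   # served from the TVC, no further recall
```

The host-facing `read` call is the same in both cases; whether the data came from the cache or from a cartridge is visible only inside the server, which mirrors the statement that the host cannot distinguish physical from virtual volumes.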