1.1. Field of the Invention
The present invention relates to the field of storage management of sequential media, and more particularly to autonomous reclamation processing of virtualized sequential media such as virtual tapes in a virtual tape library.
1.2. Description and Disadvantages of Prior Art
Storage management applications such as IBM Tivoli Storage Manager, Symantec NetBackup or EMC Legato Networker perform different data storage operations such as backup, archiving and hierarchical storage management. (IBM and Tivoli are trademarks or registered trademarks of International Business Machines Corporation and/or its affiliates in the United States, other countries, or both. Symantec and NetBackup are trademarks or registered trademarks of Symantec Corporation and/or its affiliates in the United States, other countries, or both. EMC, Legato, and Networker are trademarks or registered trademarks of EMC Corporation and/or its affiliates in the United States, other countries, or both.) Storage management applications use sequential media such as virtual tape for cost-efficient storage, mainly for data that is accessed more often than data on physical tape.
Virtual tapes are typically emulated by a virtual tape library. A virtual tape library according to prior art comprises a computing system executing tape virtualization software. This tape virtualization software emulates virtual tape drives, virtual tape libraries and virtual tapes. A virtual tape library is connected to the storage management application via an interface and a network, such as a Storage Area Network (SAN) or Local Area Network (LAN). The storage management application “sees” the virtual tape devices and virtual tapes just as real tape devices and real tapes. When the storage management application writes a virtual tape, this virtual tape is stored on a disk system also comprised in the virtual tape library. Each virtual tape in a virtual tape library has a unique serial number, which is also called the VOLSER. The VOLSER allows the unique identification of a virtual tape in a virtual tape library.
A virtual tape, just like a real tape, cannot be written in a random fashion but only sequentially from the beginning to the end. Thus it is not possible to write data to an arbitrary position on tape; data can only be written at or beyond the position where the last write operation ended. When data on a tape needs to be overwritten, the tape must be written again starting from the beginning.
Over time, data which has been written to a virtual tape expires, which causes the data or parts of the data to become inactive. The remaining data is still active. FIG. 1 shows an example of active data 102 and inactive data 104 on virtual tape 100.
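The sequential write constraint described above can be illustrated with a minimal Python sketch. The class name SequentialTape and its methods are hypothetical and chosen only for illustration; they do not correspond to any interface of an actual virtual tape library.

```python
class SequentialTape:
    """Toy model of a sequential medium (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # total capacity in bytes
        self.blocks = []          # data blocks in the order they were written
        self.used = 0             # position where the last write operation ended

    def append(self, data):
        """Write data at the current end-of-data position; return the block index."""
        if self.used + len(data) > self.capacity:
            raise IOError("tape full")
        self.blocks.append(data)
        self.used += len(data)
        return len(self.blocks) - 1

    def overwrite(self, index, data):
        """Selective overwrite of an earlier block is not possible."""
        raise NotImplementedError("sequential media cannot be overwritten in place")

    def rewrite_from_beginning(self):
        """Overwriting means starting again at the beginning of tape."""
        self.blocks.clear()
        self.used = 0
```

In this sketch the only way to replace previously written data is rewrite_from_beginning, which discards everything on the tape; this is the property that makes reclamation necessary.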
Active data is the data which is still valid and might be used for restores. Inactive data has typically expired and is not valid any more. Thus inactive data is essentially represented by the entire tape capacity minus the active data. Inactive data is a waste of storage space in the disk system of the virtual tape library (VTL) because this data is no longer needed and could potentially be deleted.
The more data becomes inactive on a virtual tape over time, the more storage capacity is wasted on that virtual tape, because the spots with inactive data 104 cannot be overwritten selectively. In addition, the data on tape typically does not expire in sequential order, leaving gaps with inactive data between active data portions on tape as shown in FIG. 1.
For example, a virtual tape according to prior art, such as IBM TS1130 emulated in a Virtual Tape Library IBM TS7500, has a capacity of 1 TB. If such a virtual tape has 50% active data left, then 500 GB of storage capacity is wasted, because it is still allocated by the virtual tape library but no longer referenced by the application software. A virtual tape can only be reused when all active data has expired or when all active data has been moved to another virtual tape. The virtual tape needs to be empty to be re-used for new backups from the beginning.
Moving the active data 102 to another virtual tape is also called reclamation. Storage management applications implement the reclamation process. The reclamation process monitors the amount of active data on each virtual tape which has been written full. Typically there is a threshold the user can set, also called the reclamation threshold. If the amount of active data falls below that threshold, the storage management software automatically copies the remaining active data from that source virtual tape to a target virtual tape which is in an empty or filling status at that moment. At the end of the reclamation process the source tape is empty and can be re-used from the beginning of tape.
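The arithmetic of this example can be written as a short Python sketch. The function name wasted_capacity_gb is hypothetical, the tape is assumed to have been written full, and 1 TB is taken as 1000 GB to match the figures in the text.

```python
def wasted_capacity_gb(capacity_gb, active_fraction):
    """Capacity still allocated by the library but holding only inactive data.

    Assumes the tape has been written full, so that everything which is
    not active data is inactive data.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return capacity_gb * (1.0 - active_fraction)


# Example from the text: a 1 TB (1000 GB) virtual tape with 50% active data.
print(wasted_capacity_gb(1000, 0.5))  # 500.0
```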
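The threshold-driven reclamation process described above may be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any particular storage management application: the names Tape, reclaim and run_reclamation are hypothetical, active data is modeled only as a list of object sizes, and the tape status is tracked from the point of view of the application.

```python
from dataclasses import dataclass, field


@dataclass
class Tape:
    volser: str                 # unique serial number of the virtual tape
    capacity: int               # capacity in arbitrary units
    status: str = "empty"       # "empty", "filling" or "full"
    active: list = field(default_factory=list)  # sizes of active data objects
    written: int = 0            # units written so far (active plus inactive)

    def active_fraction(self):
        return sum(self.active) / self.capacity


def reclaim(source, target):
    """Copy the remaining active data to the target, then mark the source empty."""
    for size in source.active:
        target.active.append(size)
        target.written += size
    target.status = "full" if target.written == target.capacity else "filling"
    source.active.clear()
    source.written = 0          # the application may now rewrite from the beginning
    source.status = "empty"


def run_reclamation(tapes, threshold):
    """Reclaim every full tape whose active fraction is below the threshold."""
    reclaimed = []
    for source in tapes:
        if source.status != "full" or source.active_fraction() >= threshold:
            continue
        needed = sum(source.active)
        # Pick the first empty or filling tape with enough free space.
        target = next(
            (t for t in tapes
             if t is not source
             and t.status in ("empty", "filling")
             and t.capacity - t.written >= needed),
            None,
        )
        if target is None:
            continue            # no suitable target tape available right now
        reclaim(source, target)
        reclaimed.append(source.volser)
    return reclaimed
```

With a threshold of 0.3, for instance, a full tape holding 20% active data would be reclaimed, while a full tape holding 80% active data would be left untouched. For simplicity the sketch only selects a target that can hold all remaining active data of the source.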
This reclamation process according to prior art has the following disadvantages:
1. The reclamation process is executed by the storage management server which consumes additional computing resources on the storage management server.
2. The reclamation process requires two virtual devices: one to read the data from and one to write the data to.
3. During reclamation the network between the storage management server and the virtual tape device is utilized.
4. Data sets or files which might belong together might be written to two distinct virtual tapes during reclamation in case one output virtual tape gets full. This causes longer restore times.
5. In order to keep the impact of the above two reasons low, the recommendation is typically to start the reclamation process when 30% or less active data resides on a sequential medium. This however causes a massive decrease in usable storage capacity: theoretically 70%, in practice usually 50%.
6. Virtual tapes which contain no active data after reclamation still consume their entire capacity (as inactive data), because the space is only released when a reclaimed virtual tape is rewritten from the beginning of tape (host block 0) by the application software.
Thus a system and method are needed which overcome these disadvantages of reclamation processing according to prior art.
1.3. Objectives of the Invention
The objective of the present invention is to provide an improved method and system for managing virtual tapes in a virtual tape library system.