This invention relates in general to data storage management. More specifically, the invention relates to reclaiming sequential storage media, such as virtual tape.
Sequential media reclamation is a process in which space is reclaimed on sequential media after portions of the data stored on the media are no longer needed. The most common type of sequential media for which this process is performed is magnetic tape. Storage management systems may implement operations called “reclamation” or “recycling” to reclaim space by copying the data that is still needed from one sequential media volume to a new volume so that the source volume can be reclaimed or reused. This is typically done after a sequential volume has filled and the amount of usable data on the volume has fallen below a specified threshold, which is usually set by the product user or administrator. The operation typically requires substantial database update activity in addition to data movement, because the location of the data on the new volume must be recorded in the database so that the data can later be located when needed by a restore or retrieve operation.
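The threshold test described above can be sketched as follows. This is only an illustrative sketch: the `VolumeStats` record, its field names, and the choice to express the threshold as a percentage of volume capacity are assumptions made for the example, not details of any particular storage management product.

```python
from dataclasses import dataclass


@dataclass
class VolumeStats:
    """Hypothetical summary of one sequential volume (names are assumptions)."""
    capacity_bytes: int  # total capacity of the volume
    written_bytes: int   # bytes appended so far, including logical vacancies
    live_bytes: int      # bytes still referenced by the server database


def is_full(stats: VolumeStats) -> bool:
    """A volume is treated as full once appended data reaches its capacity."""
    return stats.written_bytes >= stats.capacity_bytes


def usable_data_percent(stats: VolumeStats) -> float:
    """Share of the volume's capacity occupied by data that is still needed."""
    return 100.0 * stats.live_bytes / stats.capacity_bytes


def eligible_for_reclamation(stats: VolumeStats, threshold_percent: float) -> bool:
    """Eligible when the volume has filled and usable data is below the threshold."""
    return is_full(stats) and usable_data_percent(stats) < threshold_percent
```

Under these assumptions, a full volume holding 30 percent usable data would be selected when the administrator sets the threshold to 40 percent, while a partially written volume would not be selected regardless of how little usable data it holds.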
With certain storage management systems, backup or archive data stored on sequential media expires when a management policy (such as a retention or versioning policy) dictates that the data should no longer be retained. Because multiple files are stored sequentially on the media and each file may expire at a different time, segments of the stored data become unneeded over time. Upon expiration of a data object, a storage management server may logically delete the data object by removing references to the locations at which the data object was stored. Such expiration of data objects, as well as deletion of data objects for other reasons, causes logical vacancies to develop in the storage volumes. Such logical vacancies are space occupied by objects that are no longer needed. Since sequential media allows data to be appended, but does not allow internal sections of the media to be overwritten, the logical vacancies cannot be reused unless the media is rewritten from the beginning.
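The append-only behavior and logical deletion described above can be modeled with a short sketch. The `SequentialVolume` class, its method names, and the in-memory `catalog` dictionary standing in for the server database are all hypothetical and serve only to illustrate why vacated space is not reusable in place.

```python
class SequentialVolume:
    """Toy model of an append-only volume (all names are illustrative)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.write_position = 0  # end of the last object written
        # Stand-in for the server database: object name -> (offset, length)
        self.catalog: dict[str, tuple[int, int]] = {}

    def append(self, name: str, length: int) -> int:
        """Write an object after the last one; return the offset it was written at."""
        if self.write_position + length > self.capacity:
            raise ValueError("not enough usable space at the end of the volume")
        offset = self.write_position
        self.catalog[name] = (offset, length)
        self.write_position += length
        return offset

    def logically_delete(self, name: str) -> None:
        """Remove only the reference; the bytes on the media are untouched."""
        del self.catalog[name]

    def usable_portion(self) -> int:
        """Space after the last object, the only place new data can go."""
        return self.capacity - self.write_position

    def vacant_bytes(self) -> int:
        """Space held by logically deleted objects, not reusable in place."""
        live = sum(length for _, length in self.catalog.values())
        return self.write_position - live
```

In this model, deleting an object increases `vacant_bytes()` but leaves `usable_portion()` unchanged, and a later `append` still lands after the last written object rather than in the vacancy.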
FIGS. 1 through 3 illustrate sequential media reclamation. In FIG. 1, a first storage volume 105.1 is used to store a series of data objects labeled as objects A through F 110. Not shown in the figure is a storage management server that controls the reading and writing operations to the storage volume. These data objects 110 are stored sequentially on the storage volume. FIG. 1 illustrates that a usable portion 115 remains after the last data object 110. For illustrative purposes, element 120 denotes the location of the end of the final data object and thus the beginning of the usable portion 115.
FIG. 2 represents the same storage volume at a later point in time as compared with FIG. 1. During the intervening time, data objects G through M have been appended to the volume and logical vacancies 205 have been introduced because of deletion of objects B, E, G, J and L. For example, data objects may be deleted due to a retention policy or versioning policy. While the logical vacancies appear to be void of data in FIG. 2, this is for illustrative purposes only. Traditionally, when a data object is removed, it is only logically deleted, creating a logical vacancy. This may be accomplished by removing the references to the data object in the storage management server database.
FIG. 3 illustrates a new storage volume 105.2, which has been created from the physical reclamation of the first storage volume 105.1 from FIG. 2. By writing the remaining data objects 110 that are stored on the first storage volume 105.1 sequentially to this second storage volume, the logical vacancies 205 are reclaimed, and thus the usable portion 115 of the second storage volume 105.2 is larger than that of the first storage volume. After reclamation, data objects that were formerly on the first storage volume shown in FIG. 2 are now stored on the second storage volume shown in FIG. 3. This enables the first storage volume to be reused, with data objects again being written to it starting from the beginning of the tape.
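The copy from the volume of FIG. 2 to the volume of FIG. 3 can be sketched as follows. The `reclaim` function, the dictionary used as a database, and the assumption that every object occupies one unit of space are hypothetical; the figures do not specify object sizes, and the volume identifiers simply reuse the reference numerals 105.1 and 105.2 as labels.

```python
def reclaim(source_layout, database, source_id, target_id):
    """Copy still-referenced objects to a new volume, packing them sequentially.

    source_layout: list of (name, length) in physical order on the source volume.
    database: dict mapping name -> (volume_id, offset); updated in place.
    Returns the layout of the target volume and the space used on it.
    """
    target_layout = []
    write_position = 0
    for name, length in source_layout:
        entry = database.get(name)
        if entry is None or entry[0] != source_id:
            continue  # logical vacancy: no reference remains, so skip it
        target_layout.append((name, length))
        # Record the new location so a later restore can find the object.
        database[name] = (target_id, write_position)
        write_position += length
    return target_layout, write_position


# Objects A through M as in FIG. 2, each assumed to be one unit long.
names = "ABCDEFGHIJKLM"
source_layout = [(n, 1) for n in names]
database = {n: ("105.1", i) for i, n in enumerate(names)}
for n in "BEGJL":  # objects that were logically deleted
    del database[n]

target_layout, used = reclaim(source_layout, database, "105.1", "105.2")
print([n for n, _ in target_layout])  # ['A', 'C', 'D', 'F', 'H', 'I', 'K', 'M']
print(used)                           # 8
```

With unit-length objects, the thirteen positions written on the source collapse to eight on the target, and every surviving database entry now points at the second volume, which is the update activity that accompanies the data movement.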
The physical reclamation process described above consumes resources of the storage management server. For example, copying the data objects from the first storage volume to the second storage volume requires server resources. As another example, reclamation typically requires substantial database update activity, because the location of the data on the new volume must be recorded in the database so that the data can later be located when needed by a restore or retrieve operation.