1. Field of the Invention
The present invention relates to data processing operations, and in particular relates to data storage backup and restore operations.
2. Background of the Invention
Today's computer systems generate enormous amounts of data. The amount of data being stored on fixed media such as disk, and on removable media such as tape or CD/DVD, is growing rapidly as the cost of storing data on these media types continues to drop.
Many computer systems use fixed media such as disk for general day-to-day computer operations, and then periodically archive their data onto some type of removable media such as tape or CD/DVD. The removable media can then be physically transported to an off-site location to help protect data in case of some destructive or catastrophic event at the main computer site such as a fire or explosion. The data stored at the offsite location could then be used to restore the data on a replacement or the same computer system.
Data on fixed media such as disk may also be backed up periodically on-site, where data is copied to removable media and then deleted from the fixed media in order to free up space on the fixed media device. The data can then be retrieved from the removable media if it is subsequently needed. An example of this type of archive system is one that stores old email messages for a mail server, freeing additional disk space for the server's use when its disk file(s) approach capacity.
Automated library systems are very popular for automating this data backup and restore operation from fixed to removable media, where large arrays of storage cells are used to physically hold a plurality of removable media devices, and a robot or robotic subassembly is used to physically transport the media from a cell to a media drive for reading and writing data from/to the media without human intervention. An example of such a system is an L180 Automated Tape Library System available from Storage Technology Corporation, headquartered in Louisville, Colo.
When one or more removable media are removed from a tape library—such as for moving the media to a secure off-site location—the tape cartridges must be checked out of the automated library system similar to how books are checked out of a regular library. This allows the system software to keep track of the data cartridge(s) and whether or not the cartridge is physically resident in its local tape library. Software utilities commonly known as “Import” and “Export” are used to manage tape cartridge insertion into and removal from an automated library. Import and Export are typically used to move volumes to another location for long term archiving or for backup purposes, for workload balancing by moving data from one system to another, and for data interchange between systems. Import and Export utilities are also used to manage the import and export of data in a virtual tape environment, such as the environment provided by the Virtual Storage Manager™ system which is also available from Storage Technology Corporation (Virtual Storage Manager is a trademark of Storage Technology Corporation).
Many legacy software applications assumed a tape volume to be of a certain fixed storage capacity. Because of increases in tape storage capacity, it is now possible to store many of these legacy tape volumes onto a single physical tape cartridge. To avoid this type of capacity mismatch in today's computing environment, the concept of a logical or virtual volume has been defined so that the operating system or application program does not have to concern itself with the physical intricacies of the storage media. The logical/virtual volume appears to the operating system or application program as a volume having certain characteristics. However, the actual physical characteristics of the storage media may be, and typically are, much different from the storage characteristics of the logical/virtual volume being accessed. When software applications define and use logical or virtual volumes, the intricacies of the physical storage media are decoupled from the software application. Typically, an intervening tool within the overall system, logically located between the application program and the media drive, performs the logical-to-physical mapping so that the physical media and its associated characteristics are transparent to the software application. When a logical volume is defined to be smaller than the physical volume, it is possible to write a plurality of logical volumes onto the physical volume. In such a situation, a single physical tape cartridge is sometimes referred to as a multi-volume cartridge (MVC) because it contains a plurality of logical/virtual tape volumes. An individual logical/virtual volume within such a multi-volume cartridge is sometimes referred to as a virtual tape volume (VTV).
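The logical-to-physical mapping described above can be illustrated with a minimal sketch. The class names, the cartridge labels, the capacities, and the first-fit placement rule are all assumptions made for illustration; they do not describe how any particular virtual tape product places volumes.

```python
from dataclasses import dataclass, field


@dataclass
class MultiVolumeCartridge:
    """A physical cartridge (MVC) that can hold several virtual volumes (VTVs)."""
    label: str
    capacity_mb: int
    vtvs: dict = field(default_factory=dict)  # VTV name -> size in MB

    def free_mb(self) -> int:
        return self.capacity_mb - sum(self.vtvs.values())


class VolumeMapper:
    """Hypothetical intervening layer that maps VTV names to MVC labels."""

    def __init__(self, cartridges):
        self.cartridges = list(cartridges)
        self.location = {}  # VTV name -> MVC label

    def write_vtv(self, name: str, size_mb: int) -> str:
        # First-fit placement (an assumption): use the first cartridge
        # that still has enough free space for the new virtual volume.
        for mvc in self.cartridges:
            if mvc.free_mb() >= size_mb:
                mvc.vtvs[name] = size_mb
                self.location[name] = mvc.label
                return mvc.label
        raise RuntimeError(f"no cartridge has room for {name}")


# Four 400 MB virtual volumes spread over two 1000 MB cartridges.
mapper = VolumeMapper([MultiVolumeCartridge("MVC001", 1000),
                       MultiVolumeCartridge("MVC002", 1000)])
placements = [mapper.write_vtv(f"VTV{i:03d}", 400) for i in range(4)]
```

The application only ever names a VTV; the mapper alone knows which cartridge holds it, which is the decoupling the paragraph describes.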
The “Export” utility previously mentioned is an operation which copies one or more logical volumes to a physical cartridge or volume. The process of exporting identifies the relationships between the multiple VTVs and MVCs. This process can include the creation of additional copies of VTVs specifically for the export process followed by the description of the relationships of an existing set of VTVs and MVC(s), or it can just be the creation of a description of the relationships of a preexisting set of VTVs and MVC(s). After completing the export process, the cartridge can then be ejected from the library system for cartridge relocation to another system, for reasons previously described.
In similar fashion, the “Import” utility is an operation that can be used to copy one or more previously exported logical volumes from an exported physical volume into the local system environment. In some situations, the import process is not used to copy data into the system, but rather is used to populate the target system with the relationships between the VTVs and the MVCs. A manifest file is created during the Export operation. It lists all the virtual volumes on one or more exported MVC cartridges, and is subsequently used by the Import utility when copying virtual volumes from the previously exported MVC volume(s) into the local system environment. The problem with today's systems is that a separate manifest file can be created each time an Export operation is run, and it is when multiple manifest files have been created that a problem occurs: when an Import operation is subsequently run, each of these manifest files must be processed. This problem stems from the fact that this type of export process creates a manifest file for a unique set of VTVs and MVCs that bears no relationship to any predefined group of VTVs and/or MVCs. The particular challenge is that when manifest files of this type are processed, the relationships can be quite complex. For example, if a storage class is defined that causes VTVs of this group to be pre-grouped as a collection on a unique set of MVCs, it is possible to export this storage class as a group and manage the group in a cumulative manner. In this situation, each manifest file is cumulative when it is created. However, if a set of VTVs for which there is no predefined grouping (storage class) is exported, the resultant manifest file is unique. It is independent of, and unrelated to, earlier or later manifest files that pertain to these VTVs.
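As a rough illustration of what one Export run might record, the sketch below builds a manifest that lists every VTV on the exported MVCs. The field names (`export_id`, `created`, `entries`), the identifiers, and the JSON layout are hypothetical; the text does not specify the actual manifest format.

```python
import json
from datetime import datetime, timezone


def build_manifest(export_id, placements, created=None):
    """Return a manifest dict listing every VTV on the exported MVCs.

    `placements` maps an MVC label to the VTV names written on it.
    All field names here are illustrative, not a real manifest format.
    """
    created = created or datetime.now(timezone.utc)
    entries = [
        {"vtv": vtv, "mvc": mvc}
        for mvc, vtvs in sorted(placements.items())
        for vtv in sorted(vtvs)
    ]
    return {
        "export_id": export_id,
        "created": created.isoformat(),
        "entries": entries,
    }


# One export run produces one standalone manifest (placeholder values).
manifest = build_manifest(
    "EXPORT-0001",
    {"MVC001": ["VTV002", "VTV001"], "MVC002": ["VTV003"]},
    created=datetime(2000, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(manifest, indent=2))
```

Nothing in such a manifest refers to any earlier or later one, which is why manifests for ungrouped VTVs are independent of each other.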
When these manifest files are subsequently imported, the user/system operator has to determine the sequence of importing since two or more manifest files may contain different versions of the same dataset (VTV). The user/system operator also potentially needs to process a great number of these manifest files since there is not an easy way to tell which manifest files hold no valid data and therefore do not require processing.
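The sequencing decision that falls to the operator can be sketched as follows. This is only an illustration under a strong assumption: that each manifest carries a sortable creation value and that the most recently created manifest always holds the current version of a VTV. The manifest fields (`id`, `created`, `entries`) are hypothetical, and real manifests may not carry enough information for so simple a rule.

```python
def resolve_current_versions(manifests):
    """Pick, for each VTV, the entry from the most recently created manifest.

    Returns (current, needed): `current` maps a VTV name to
    (MVC label, manifest id), and `needed` is the set of manifest ids
    that still contribute at least one current VTV.
    """
    current = {}
    # Process oldest first so that later manifests overwrite earlier ones.
    for manifest in sorted(manifests, key=lambda m: m["created"]):
        for entry in manifest["entries"]:
            current[entry["vtv"]] = (entry["mvc"], manifest["id"])
    needed = {manifest_id for _, manifest_id in current.values()}
    return current, needed


manifests = [
    {"id": "M3", "created": 3, "entries": [{"vtv": "VTV002", "mvc": "MVC003"}]},
    {"id": "M1", "created": 1, "entries": [{"vtv": "VTV001", "mvc": "MVC001"}]},
    {"id": "M2", "created": 2, "entries": [{"vtv": "VTV001", "mvc": "MVC002"},
                                           {"vtv": "VTV002", "mvc": "MVC002"}]},
]
current, needed = resolve_current_versions(manifests)
# Manifests that hold no current data and could be skipped on import.
skippable = {m["id"] for m in manifests} - needed
```

In this example manifest M1 is entirely superseded, yet that only becomes apparent after every manifest has been read, which is the operational burden described above.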
Consider, for example, a user who runs four export processes each day and, after two years, decides to import this data into a new system as a result of a disaster. There would be 2,920 manifest files (4 per day × 365 days × 2 years) to process. This is a massive operational issue. This problem is exacerbated in a virtual environment. By contrast, the relationship between physical volumes and physical cartridges is simple. The physical volume(s) either exist on the physical cartridge(s) or they do not, and typically a Tape Management System (TMS) stores all the information pertaining to which volume is current and which physical volume it resides on. When virtual volumes such as VTVs are loaded onto physical volumes such as MVCs, only the Virtual Storage Manager system where they were created has full knowledge of the complex relationships between the virtual and physical components. Each time a manifest file is created (and there may be many), the knowledge of these relationships that it captures is unique and complete. However, as soon as there is more than one manifest file that needs to be processed, the information contained in these multiple manifest files is insufficient to resolve all the relationships. It would therefore be desirable to provide a system and method to better manage a plurality of manifest files in a virtual data storage environment.