1. Field of the Invention
This invention relates generally to computer data storage management, and more specifically to techniques for the backup, archiving, recovery, and/or restoration of Migration Level Two (ML2) tape files.
2. Description of the Prior Art
Present-day computer data processing systems generally include a host processor having one or more central processing units. The host processor is supported by memory facilities and input/output (I/O) interfaces. One or more buses are often employed to provide interconnections between the various components of a computer data processing system.
The processing units execute instructions which specify the manipulation of data stored within the memory facilities. Therefore, the memory facilities must be capable of storing data required by the processor and transferring that data to the processor at a rate capable of making the overall operation of the computer feasible. The cost and performance of computer memory is thus critical to the commercial success of a computer system.
As computers manipulate ever-increasing amounts of data, they require larger quantities of data storage capacity. A typical data processing system includes both main memory and one or more peripheral storage devices. A data processing system having a plurality of peripheral storage devices arranged hierarchically is referred to as a data storage hierarchy.
In a data storage hierarchy, the term "primary data storage" refers to the data storage level having the highest level of performance and the lowest level of storage capacity. The primary data storage level is oftentimes referred to as "level 0" data storage. Secondary, or level 1, storage includes storage capacity equal to or greater than level 0 storage, but at reduced cost and performance. Similarly, level 2 data storage (also referred to as "auxiliary storage") has lower cost and performance than level 1 storage. However, level 2 storage includes a storage capacity equal to or greater than level 1 storage. Level 2 storage is often implemented using magnetic tape data storage drives. Data are accessed from these drives by means of relatively cumbersome mechanical tape mounting operations.
Various techniques have been developed to provide computer file data storage management. Storage management may be defined as the manipulation of a data storage hierarchy to balance system performance, data storage, and cost. A storage management system moves and copies data between different levels of the hierarchy to perform these balancing functions. The manipulation of the hierarchy may involve operations such as the deletion of data which are no longer being used.
Storage management includes several subcomponents, such as performance management, capacity management, space management, and availability management. Each of these subcomponents may involve the transfer of data between different levels of the hierarchy. Space management is the movement of data between different levels of the hierarchy so as to store data only in the most appropriate level of the peripheral storage hierarchy. For example, relatively active data should be stored in a relatively high performance level of the hierarchy, and relatively inactive data should be stored within a relatively low performance, low cost level of the hierarchy.
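By way of illustration only, the space management principle described above may be sketched as a simple placement rule. The function name and the two age thresholds are hypothetical and are chosen solely for this example; they are not taken from any particular storage management product.

```python
def choose_level(days_since_last_reference,
                 level1_threshold_days=30,
                 level2_threshold_days=180):
    """Pick a hierarchy level (0, 1, or 2) for a file based on inactivity.

    The thresholds are illustrative assumptions: active data stay in the
    high performance level 0, and progressively less active data are placed
    in the lower cost, lower performance levels 1 and 2.
    """
    if days_since_last_reference >= level2_threshold_days:
        return 2  # least active data: lowest cost level (e.g. tape)
    if days_since_last_reference >= level1_threshold_days:
        return 1  # moderately inactive data: secondary storage
    return 0      # relatively active data: primary storage


if __name__ == "__main__":
    for age in (0, 45, 400):
        print(age, "days idle -> level", choose_level(age))
```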
As data age, they are generally referenced less and less. Since such data are relatively less active, they should be moved to a lower performance level of the data storage hierarchy. The movement of data from one level of a data storage hierarchy to another is referred to as "migration", and may include data compression techniques to conserve data storage space. Transferring a file by migration may include the maintenance of a primary copy of a file in level 0 storage. The primary copy is, however, an empty file. The data in the file have already been transferred to the secondary copy of the file in level 1 storage.
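The migration behavior described above may be sketched, under simplifying assumptions, as follows. The class and method names are hypothetical, in-memory dictionaries stand in for the level 0 and level 1 storage devices, and zlib is used merely as one example of a data compression technique.

```python
import zlib


class TwoLevelHierarchy:
    """Toy model of level 0 and level 1 storage (illustrative only)."""

    def __init__(self):
        self.level0 = {}  # file name -> uncompressed bytes (primary copies)
        self.level1 = {}  # file name -> compressed bytes (secondary copies)

    def migrate(self, name):
        """Move a file's data to level 1, leaving an empty primary copy."""
        data = self.level0[name]
        self.level1[name] = zlib.compress(data)  # conserve storage space
        self.level0[name] = b""                  # primary copy is now an empty file

    def recall(self, name):
        """Bring migrated data back into the primary copy in level 0."""
        data = zlib.decompress(self.level1.pop(name))
        self.level0[name] = data
        return data


if __name__ == "__main__":
    h = TwoLevelHierarchy()
    h.level0["report"] = b"inactive data " * 100
    h.migrate("report")
    print(len(h.level0["report"]), len(h.level1["report"]))
```

In this sketch the primary copy remains present in level 0 after migration, but holds no data, which mirrors the empty primary file described above.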
Availability management is the backup of data within a data storage hierarchy to improve the likelihood of the data being available if and when they are needed by the host processor. The original or primary copy of the data is not deleted; an additional copy is generated and transferred to another portion of the data storage hierarchy. The secondary copy is typically stored on a different peripheral storage device from the primary copy to ensure the availability of the data. If the primary copy of the data is rendered unavailable, such as by device failure, the secondary copy of the data may still be referenced. The secondary copy of the data need not be stored in a different level of the data storage hierarchy, but this may nevertheless be desirable because the secondary copy is not likely to be as active as the primary copy.
Storage management has traditionally been performed manually. The owner of the data decides when to migrate or back up data, and where such migrated and backup files should be stored. Such decisions are time consuming, usually requiring a review of each file stored. The operations involved are often so intensive that manual reviews and decisions are not made until there is no alternative. For instance, a system user might not migrate any files to level 1 storage until all storage space in level 0 storage is filled. In large systems, or in any system storing relatively large amounts of data, it is simply impractical to perform manual data storage management.
In recent years, computer software has been developed to provide automated data storage management, thereby reducing the need for manual operations. One example of such a management system is the IBM Data Facility Storage Management Sub-System for Virtual Machines software package, hereinafter referred to as "DFSMS/VM". DFSMS/VM software is available from the International Business Machines (IBM) Corporation of Armonk, N.Y. DFSMS is a trademark of the IBM Corporation.
Systems such as DFSMS/VM commonly provide a function for backing up (archiving) data on a magnetic tape data storage drive. However, a function for optimizing space management on a magnetic tape storage drive is seldom offered. For example, one space management system providing for the migration of data is known as the migration level two (ML2) system. ML2 utilizes magnetic tape data storage drives to provide data storage at a hierarchical position of level 2. ML2 capability has only been offered in a select few software data storage management systems, including the IBM Data Facility Hierarchical Storage Manager system (DFHSM), which is a utility to the IBM Multiple Virtual Storage (MVS) series of software operating systems. DFHSM and MVS are available from the International Business Machines Corporation of Armonk, N.Y. DFHSM is a trademark of the IBM Corporation.
Although the use of ML2 data management techniques results in enhanced space management efficiency, prior art data management software systems are not without drawbacks. For instance, these software systems require relatively frequent tape mounting operations. Tape mounts are costly in terms of response time and installation resource commitment. As a practical matter, tape mounts must be carefully controlled, and should be avoided if at all possible.
If it is desired to provide a backup copy of data stored at the ML2 level, presently existing software requires a tape drive mount to bring the data back into primary storage, and possibly a second tape mount to put the data into a backup repository. Furthermore, prior art software does not provide for any backups of files which have been migrated to tape. Prior art approaches resolve this problem by providing a tape dual copy function for the ML2 physical tape volumes. However, this resolution does not protect against loss of the primary file due to device failure or accidental erasure.
An additional drawback of prior art migration software relates to the recovery of a migrated file. It is undesirable to recover a migrated file to the primary storage area. An attempt to restore data back to the primary storage area may fail, because migration schemes permit the primary storage area to become overcommitted. Furthermore, if the primary storage area had not become overcommitted, there would have been no reason to migrate the file in the first place.
Prior art migration techniques do not optimally exploit design features which offer the potential for improved performance and simplicity. More specifically, many of the same processing steps are employed to implement the functions of managing the migration inventory/repository and managing the backup inventory/repository. Existing systems effectively execute the same steps twice: once for migration, and once for backup. No existing system attempts to consolidate these steps into a unified, more efficient operation. As a result, the migration and backup repositories are situated on separate tape volumes. Similarly, the migration and backup inventories are also segregated.
Presently-existing methods of performing an archive of a file that is migrated to ML2 tape require numerous processing steps. First, a recall of the file must be accomplished. This requires a tape mount, data movement from the ML2 tape repository back to primary storage, and updating the associated inventory. Next, the data are archived. The step of archiving the data involves three sub-steps. First, another tape mount must be provided. Second, data must be moved in order to place the required data on the archive (or backup) repository. Third, the associated inventory must be updated. After archival, the primary version of data (the version of data previously recalled) is erased.
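The cost of the prior art archive sequence described above may be made concrete with the following sketch, which tallies the tape mounts, data movements, and inventory updates incurred in archiving a single migrated file. All names are hypothetical; dictionaries and sets stand in for the tape repositories and inventories, and the removal of the file from the ML2 repository upon recall is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    tape_mounts: int = 0
    data_moves: int = 0
    inventory_updates: int = 0


def archive_migrated_file_prior_art(name, ml2_tape, primary, archive_tape,
                                    migration_inventory, backup_inventory):
    """Archive a file that resides on ML2 tape, following the prior art steps."""
    c = Counters()

    # Recall: mount the ML2 tape, move the data to primary storage,
    # and update the migration inventory.
    c.tape_mounts += 1
    primary[name] = ml2_tape.pop(name)  # assumption: recall removes the ML2 copy
    c.data_moves += 1
    migration_inventory.discard(name)
    c.inventory_updates += 1

    # Archive: mount the archive tape, move the data onto the archive
    # (or backup) repository, and update the backup inventory.
    c.tape_mounts += 1
    archive_tape[name] = primary[name]
    c.data_moves += 1
    backup_inventory.add(name)
    c.inventory_updates += 1

    # After archival, erase the primary version that was recalled.
    del primary[name]
    return c


if __name__ == "__main__":
    ml2, primary, archive = {"payroll": b"data"}, {}, {}
    mig_inv, bak_inv = {"payroll"}, set()
    print(archive_migrated_file_prior_art("payroll", ml2, primary, archive,
                                          mig_inv, bak_inv))
```

Under these assumptions, each archived file incurs two tape mounts, two data movements, and two inventory updates, even though the data begin and end on tape.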
In view of the foregoing considerations, there is a manifest need for an improved system which integrates Migration Level Two (ML2) and backup tape processing. The system should avoid tape mounts if possible. The system should include provisions for files backed up after the migration process, such that these files may be restored to the migrated state upon recovery. Such a feature is not offered by existing systems because these systems cannot back up migrated files. It would be desirable to have all tape volumes included within one large tape pool, as opposed to having separate tape volumes for the migration repository and the backup repository. It would also be desirable to integrate the migration inventory and the backup inventory. Furthermore, it would be desirable to develop an improved method for performing an archive of a file that is migrated to ML2 tape. Such a method should minimize the amount of data transfer operations and/or the number of tape mounts which are required.