The present disclosure relates to methods for managing and recovering data stored on storage devices.
Tiered storage techniques allow for the movement of data between different tiers of a data storage infrastructure, for example between higher-cost, higher-performance storage devices (e.g., hard disk drives) and relatively lower-cost, lower-performance storage devices (e.g., magnetic tape drives). A tiered storage management system (or hierarchical storage management system) typically has the ability to move data dynamically between different storage devices based on predictions regarding which data will be most frequently requested or used in the future. Data that has not been requested or used within a certain period of time (e.g., one week or one month) may be archived (or migrated) to a lower-cost storage device.
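As a minimal sketch of the age-based migration described above, the following Python fragment moves idle items from a higher tier to a lower tier. The `StoredItem` type, the `"disk"` and `"tape"` tier labels, and the `migrate_cold_data` function are hypothetical names introduced only for illustration; an actual tiered storage management system may also use predictions of future access rather than idle time alone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredItem:
    name: str
    tier: str                # hypothetical tier label: "disk" or "tape"
    last_accessed: datetime  # time of the most recent request or use


def migrate_cold_data(items, now, max_idle=timedelta(days=7)):
    """Move items idle for longer than max_idle from disk to tape.

    Returns the names of the items that were migrated, in input order.
    """
    migrated = []
    for item in items:
        if item.tier == "disk" and now - item.last_accessed > max_idle:
            item.tier = "tape"  # archive to the lower-cost tier
            migrated.append(item.name)
    return migrated
```

With a one-week threshold, an item last accessed a month ago would be migrated, while an item accessed yesterday would remain on the higher-performance tier.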
Some operating systems that support high-volume transaction processing, such as z/OS from IBM®, manage data by means of data sets. A data set may comprise a text or binary file that includes data, such as one or more records (e.g., medical records or insurance records) used by a program running on the system. A data set may also be used to store information needed by applications running on the system (e.g., source programs or macro libraries) or by the operating system itself (e.g., system variables).
The location of an existing data set may be determined if the data set name and a corresponding data storage volume are known. A data storage volume (or volume) may comprise a unit of a data storage device that is separately addressable and may be identified by a volume identifier (e.g., a six-character volume serial number or VOLSER). In some cases, if the data set is cataloged, then only the data set name is required in order to locate the data set. However, cataloging a data set may require that the data set have a unique name or identifier. A catalog may describe various data set attributes and provide a mapping to the storage devices or volumes on which the data set is located. In some cases, a catalog and a volume table of contents (VTOC) may reside on a direct access storage device (DASD) that is mounted during operation of the system. The VTOC may list the data sets that reside on the DASD, along with information about the location and size of each of the data sets on the DASD. The system may have a master catalog containing entries for each of the catalogs that are used on the system, including pointers to the catalogs. During system initialization, the master catalog may be read to acquire system-level data sets and to determine the location of the catalogs.
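The two lookup paths described above (name plus volume, or name alone for a cataloged data set) can be sketched in Python as follows. The dictionaries `CATALOG` and `VTOCS`, the data set names, the volume serials, and the `locate` function are all hypothetical stand-ins for illustration; they do not reflect the on-disk format of an actual catalog or VTOC.

```python
# Hypothetical catalog: data set name -> volume serial (VOLSER).
CATALOG = {"PAYROLL.MASTER": "VOL001"}

# Hypothetical VTOCs: VOLSER -> {data set name: (start_track, size_in_tracks)}.
VTOCS = {
    "VOL001": {"PAYROLL.MASTER": (120, 45)},
    "VOL002": {"SCRATCH.WORK": (10, 5)},
}


def locate(dsname, volser=None):
    """Return (volser, start_track, size_in_tracks) for a data set.

    If volser is omitted, the data set must be cataloged so that the
    name alone identifies the volume. Raises KeyError if the data set
    cannot be found.
    """
    if volser is None:
        volser = CATALOG[dsname]         # cataloged: the name is sufficient
    start, size = VTOCS[volser][dsname]  # the VTOC gives location and size
    return volser, start, size
```

In this sketch, `locate("PAYROLL.MASTER")` succeeds through the catalog, whereas the uncataloged `SCRATCH.WORK` can only be found when its volume serial is supplied as well.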
A generation data group may comprise a collection of related data sets. Each data set within a generation data group may be referred to as a generation data set. In some cases, a generation data group may comprise a collection of historically related data sets that are arranged in chronological order (e.g., successive updates to a particular file). An advantage of grouping related data sets is that all of the data sets in the generation data group may be referred to by a common base name. In some cases, the number of generation data sets in a generation data group may be limited such that once the maximum number is reached, the creation of a new generation data set leads to the deletion of the oldest generation data set in the generation data group, thereby ensuring that the maximum number of generation data sets in the generation data group will not be exceeded.
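The limit behavior described above, in which creating a new generation data set removes the oldest one once the maximum is reached, can be sketched as follows. The `GenerationDataGroup` class and its `add` method are hypothetical names used only for illustration.

```python
class GenerationDataGroup:
    """Illustrative model of a generation data group with a size limit."""

    def __init__(self, base_name, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.base_name = base_name
        self.limit = limit
        self.generations = []  # generation data set names, oldest first

    def add(self, dsname):
        """Add a new generation data set and return any names removed.

        When the group would exceed its limit, the oldest generation
        data sets are deleted until the limit is satisfied again.
        """
        self.generations.append(dsname)
        removed = []
        while len(self.generations) > self.limit:
            removed.append(self.generations.pop(0))
        return removed
```

With a limit of three, adding a fourth generation data set would remove the first one added, so the group never holds more than three.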
In some cases, a generation data set may be retrieved by using either a relative generation number or an absolute generation number. An absolute generation number may include a base name and a suffix in the form of GxxxxVyy, where xxxx is an unsigned 4-digit decimal generation number (0001 through 9999) and yy is an unsigned 2-digit decimal version number (00 through 99). For example, A.B.C.G0001V00 may be generation data set 1 in a generation data group with a base name of “A.B.C”. A relative generation number may use a generation data group base name followed by a negative integer, a positive integer, or 0 enclosed in parentheses. For example, a generation data set may be retrieved using a relative generation number such as A.B.C(−1). When a relative generation number is used to catalog a generation data set, the system may assign an absolute generation number to represent the generation data set. The absolute generation number assigned may depend on the number last assigned and the value of the relative generation number that is specified. For example, if A.B.C.G0005V00 was the last generation data set cataloged, and a relative generation number of A.B.C(+2) is provided, then the next generation data set cataloged may be assigned the absolute generation number A.B.C.G0007V00. In some cases, the maximum number of generation data sets in a generation data group may be limited to 255 generation data sets.
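The naming scheme above can be illustrated with a short Python sketch that parses an absolute generation number and resolves a relative generation number against the generations already cataloged. The function names are hypothetical, and the sketch makes simplifying assumptions: new generations are always assigned version 00, a relative number of 0 refers to the most recently cataloged generation, and wraparound past generation 9999 is not handled.

```python
import re

# Base name, then ".G" + 4-digit generation + "V" + 2-digit version.
_ABSOLUTE = re.compile(r"^(?P<base>.+)\.G(?P<gen>\d{4})V(?P<ver>\d{2})$")


def parse_absolute(name):
    """Split e.g. 'A.B.C.G0001V00' into ('A.B.C', 1, 0)."""
    match = _ABSOLUTE.match(name)
    if match is None:
        raise ValueError(f"not an absolute generation name: {name!r}")
    return match.group("base"), int(match.group("gen")), int(match.group("ver"))


def format_absolute(base, gen, ver=0):
    """Build an absolute generation name from its parts."""
    if not 1 <= gen <= 9999 or not 0 <= ver <= 99:
        raise ValueError("generation or version number out of range")
    return f"{base}.G{gen:04d}V{ver:02d}"


def resolve_relative(base, cataloged, relative):
    """Resolve a relative generation number to an absolute name.

    cataloged: generation numbers already cataloged, in ascending order.
    relative > 0 names a new generation (last assigned + relative);
    relative <= 0 refers to an existing generation, counting back
    from the most recent one.
    """
    if relative > 0:
        last = cataloged[-1] if cataloged else 0
        return format_absolute(base, last + relative)
    index = len(cataloged) - 1 + relative
    if index < 0:
        raise LookupError("no such generation")
    return format_absolute(base, cataloged[index])
```

Using the example from the text, if generation 5 was the last one cataloged, then `resolve_relative("A.B.C", [3, 4, 5], 2)` yields `A.B.C.G0007V00`, and a relative number of −1 yields `A.B.C.G0004V00`.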