1. Field of the Invention
The present invention relates to a computer program product, system, and method for data set management.
2. Description of the Related Art
Hierarchical Storage Management (HSM) is a data storage technique which automatically moves data between a primary and a secondary storage tier. HSM is sometimes also referred to as tiered storage. In HSM systems, data files that are frequently used are stored on high-speed storage devices of the primary storage tier, such as solid state drives (SSDs) or hard disk drive arrays. These primary devices are more expensive per byte stored than the slower devices of the secondary storage tier, such as optical discs and magnetic tape drives. The bulk of application data is stored on the slower low-cost secondary storage devices and copied to the faster high-cost disk drives when needed. In effect, HSM turns the fast disk drives into caches for the slower mass storage devices.
The HSM system automatically migrates data files from the primary disk drives to the secondary tape drives if the files have not been used for a certain period of time, typically a few months. This data migration frees expensive disk space on the primary storage devices. If an application accesses a file which is on a secondary storage device, the file is automatically recalled, that is, moved back to the primary disk storage. Due to this transparent file recall capability, the file remains accessible from a client application although it has been physically migrated to the secondary storage. HSM is implemented, for example, in the Tivoli® Storage Manager.
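By way of illustration only, the migration and transparent recall behavior described above may be sketched as follows. The sketch is a simplified in-memory model and not the implementation of any particular HSM product; the class name HsmStore, its methods, and the 90-day threshold are hypothetical and chosen solely for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HsmStore:
    """Toy two-tier store: a fast 'primary' tier and a slow 'secondary' tier."""
    migrate_after_days: float = 90.0               # assumed policy threshold
    primary: dict = field(default_factory=dict)    # name -> (data, last_access_day)
    secondary: dict = field(default_factory=dict)  # name -> data

    def write(self, name, data, today):
        """Store a file on the primary tier and record the access day."""
        self.secondary.pop(name, None)  # drop any stale migrated copy
        self.primary[name] = (data, today)

    def migrate(self, today):
        """Move files unused for longer than the threshold to the secondary tier."""
        stale = [name for name, (_, last) in self.primary.items()
                 if today - last > self.migrate_after_days]
        for name in stale:
            data, _ = self.primary.pop(name)
            self.secondary[name] = data
        return stale

    def read(self, name, today):
        """Return file data, recalling it from the secondary tier if migrated."""
        if name not in self.primary:
            data = self.secondary.pop(name)  # raises KeyError if the file is unknown
            self.primary[name] = (data, today)
        data, _ = self.primary[name]
        self.primary[name] = (data, today)   # refresh the last-access day
        return data
```

In this sketch the caller of read() does not need to know on which tier the file resides, which corresponds to the transparent recall capability: a migrated file is moved back to the primary tier as a side effect of the access.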
HSM may include storage tiering, which is the placement of data on different devices in the multi-tiered storage based on the type of usage, the performance and capacity requirements of the data, and the characteristics of the devices. Storage tiering is often a manual process in which administrators assign data to different locations within the multi-tiered storage system.
Automated storage tiering programs automatically manage data placement by observing the characteristics of data in the multi-tiered storage and automatically moving the data among the different tiers of storage. Automated storage tiering decisions are based on observation of workloads or pre-set administrator policies which statically partition resources. To determine where to store data in a multi-tier storage system, a storage manager program will analyze data access patterns, workloads on the storage devices, and usage of the devices and determine the tiers and devices within tiers on which to locate data.
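By way of illustration only, one simple form of an automated placement decision may be sketched as follows. The sketch considers only observed access counts and data sizes, whereas an actual storage manager program may also weigh device workloads and device usage. The function name place_by_access_frequency, its parameters, and the greedy access-per-byte ranking are assumptions made for the example and are not taken from any particular product.

```python
def place_by_access_frequency(access_counts, sizes, fast_capacity):
    """Assign each data item to the 'fast' or 'slow' tier.

    access_counts: item name -> observed number of accesses
    sizes:         item name -> size in arbitrary units (must be positive)
    fast_capacity: total capacity of the fast tier, in the same units

    Items are ranked by accesses per unit of size, and the fast tier is
    filled greedily in that order; everything else goes to the slow tier.
    """
    ranked = sorted(
        access_counts,
        key=lambda name: (-access_counts[name] / sizes[name], name),
    )
    placement = {}
    free = fast_capacity
    for name in ranked:
        if sizes[name] <= free:
            placement[name] = "fast"
            free -= sizes[name]
        else:
            placement[name] = "slow"
    return placement


if __name__ == "__main__":
    observed = {"logs": 2, "index": 500, "archive": 1, "db": 300}
    item_sizes = {"logs": 40, "index": 10, "archive": 100, "db": 50}
    print(place_by_access_frequency(observed, item_sizes, fast_capacity=64))
```

A greedy ranking is used here only because it is short and easy to follow; it does not guarantee an optimal use of the fast tier, and a pre-set administrator policy could replace the ranking key without changing the rest of the sketch.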