1. Field of the Invention
The present invention is directed generally to the management of direct access storage devices (DASD) and more particularly to a method and apparatus for controlling the migration of files in a computer system amongst different levels of storage.
2. Description of the Invention Background
All organizations that use computers are faced with the challenge of managing the data that are generated by the users of those computers. Everyone who uses a computer on a regular basis knows that it is only a matter of time until the computer's disk storage is filled with a flood of memos, spreadsheets, schedules, proposals, letters, data bases, electronic books, sound files, and any other item that can be stored in electronic form. That problem is compounded by software that often creates files without the knowledge of the user. This explosion of data results in the eventual inability to create new data, as all the available disk storage space has been exhausted. Having long given up hope that users will manage their own files, and not having an unlimited budget for buying disk storage, most computer managers have turned to archival software to solve this problem, or at least delay the purchase of more disk storage.
Most computer systems maintain the date that a particular file was last accessed (LASTDATE). Archival software uses that date to determine which files have not been used for a while and should be archived or removed to make way for new data requirements and application growth. When the archival software determines that a file is "old enough", it assumes it will either no longer be used or at least that it is unlikely to be needed again. Typically, the first action is to move or "archive" the file from its current location to a new location, "lower" in the storage hierarchy and usually considered a less "expensive" location. This action is also referred to as migration.
Most archival products maintain files at multiple levels of a storage hierarchy:
Primary storage is disk storage where the most active files are kept. Users can access files here with no delay, but it is the most "expensive" type of storage. To the degree that primary storage is littered with files that have not been referenced for significant periods, it can be said that some percentage of primary storage is being wasted. In our implementation, primary storage is also known as migration level 0 (ML0).
Compressed storage also resides on disk, but contains multiple compressed and consolidated files. While this compression makes the storage less expensive, users accessing the files stored here must wait while the data are uncompressed and moved back to primary storage. While most archival systems do this automatically when the user attempts to access the data, there is still a short delay. In our implementation, this storage is also known as migration level 1 (ML1).
Offline storage resides on tape and is inexpensive to buy, but difficult to access. When users attempt to access a file that has been moved to offline storage, they must wait while a tape is mounted and the data are moved back to primary storage. While this process is also automated, the delays can be greater. In our implementation, this storage is also known as migration level 2 (ML2).
creating and maintaining a data base containing size information and historical information about the use of data sets residing on the computer system;
calculating a next reference date for certain of the data sets and a confidence level for each of the next reference dates from information in the data base;
defining an amount of the highest level storage space which is to remain available;
identifying which data sets should be migrated between the storage levels of the computer system based on the next reference dates, the confidence levels, the sizes of the data sets, and the amount of highest level storage space which is to remain available; and
migrating the identified data sets.
It becomes clear that the essence of the problem quickly boils down to the question, which files should be archived, or how "old" is "old-enough"?
Because the operating system keeps track of the last reference date (LASTDATE), an inspection of any group of files allows for an easy computation of the AGE of a data set. That is, we define the AGE of the data set as the number of days since it was last referenced, or:
AGE = TODAY's DATE - LASTDATE
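As an illustration only, the age computation can be sketched in a few lines of Python, taking AGE as the number of days elapsed since the last reference. The function name `age_in_days` and the sample dates are assumptions made for this sketch and do not come from any archival product.

```python
from datetime import date


def age_in_days(last_referenced: date, today: date) -> int:
    """Number of days since the data set was last referenced (its AGE)."""
    return (today - last_referenced).days


# Hypothetical example: last referenced on 1 March, inspected on 21 March.
print(age_in_days(date(1995, 3, 1), date(1995, 3, 21)))  # 20
```

A data set referenced today has an age of 0, and the age grows by one for each day it goes unreferenced.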
All known techniques for space management are based on some form of what we will call parameter-based rules. Those rules are based on a common least-recently used (LRU) algorithm. That is, the set of parameters governing the choices made by the archival (space management) software are centered around the rule of moving the least recently referenced data files. More specifically, a rule is established that states that any data set that has an age greater than "n" days should be migrated from ML0 to ML1.
Because of the uncertainty in choosing a value for "n", the data movement is typically restricted to the nearest level of the storage hierarchy (ML1). Hence, the inactive data set is "staged" on ML1 where it will be poised for a relatively quick and "pain-free" recall to primary (ML0) or will continue to "age".
Because ML1 is a finite resource that requires its free space to be managed, a corresponding rule is set forth for residency on ML1. In the general case, any data set on ML1 that has an age greater than "m" days should be migrated from ML1 to ML2. Thus, a general rule for migration "policy" parameters can be represented by a set of two numbers (n, m). A data set is said to be eligible to be migrated if its age meets the criteria specified in the rules.
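The two-parameter rule described above can be sketched as follows. This is a minimal illustration under the stated reading of the rule (age greater than "n" moves a data set from ML0 to ML1, age greater than "m" moves it from ML1 to ML2); the function name `next_level` and its signature are hypothetical.

```python
from typing import Optional


def next_level(current_level: int, age_days: int, n: int, m: int) -> Optional[int]:
    """Return the level a data set is eligible to migrate to, or None.

    current_level: 0 for ML0 (primary), 1 for ML1 (compressed disk).
    n, m: the (n, m) policy parameters, in days.
    """
    if current_level == 0 and age_days > n:
        return 1  # eligible to move ML0 -> ML1
    if current_level == 1 and age_days > m:
        return 2  # eligible to move ML1 -> ML2
    return None   # not eligible under this policy


# Hypothetical policy (n, m) = (20, 40)
print(next_level(0, 25, n=20, m=40))  # 1
print(next_level(1, 30, n=20, m=40))  # None
print(next_level(1, 45, n=20, m=40))  # 2
```

Note that the rule looks only at age and current level; nothing else about the data set enters the decision.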
It is highly unlikely that one policy is ever satisfactory for a given DASD "farm" because of the wide diversity of application data and user constituencies. The lack of sufficient granularity in such a policy leads to gross inequities. Likewise, the lack of sufficient granularity or distinctions in migration policy leads to a variety of system inefficiencies caused by "bad" decisions. Inevitably, a set of policies or rules emerges to address perceived differences in the data residency requirements.
In IBM's System Managed Storage (SMS) scheme, the set of rules is composed of a series of management class (MGMTCLAS) rules. Each MGMTCLAS name represents a set of rules with which the archival software (i.e. in this case, IBM's product DFSMShsm or "HSM") will operate. The MGMTCLAS rule establishes whether a given data set is eligible to be migrated. The essence of the MGMTCLAS concept is depicted in the following table of examples:
______________________________________
MGMTCLAS    "n"    "m"    Backup Criteria
______________________________________
STANDARD    20     40     BC1
SPECIAL1    20     40     BC2
SPECIAL2    20     0      BC1
SPECIAL3    5      60     BC1
etc.
______________________________________
Note that "BC1" represents some policy for incremental backup of files. It is important to see that the only difference between STANDARD and SPECIAL1 is a different backup policy (BC2). The influence of the backup policy will be underscored later.
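One way to picture the table is as a lookup from class name to a small record of parameters. The sketch below is purely illustrative: the `ManagementClass` record and the helper `differs_only_in_backup` are assumptions for this example and are not part of SMS or DFSMShsm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManagementClass:
    n: int        # age in days before ML0 -> ML1 eligibility
    m: int        # age in days before ML1 -> ML2 eligibility
    backup: str   # name of the backup criteria


# Rows taken from the example table.
MGMTCLAS = {
    "STANDARD": ManagementClass(20, 40, "BC1"),
    "SPECIAL1": ManagementClass(20, 40, "BC2"),
    "SPECIAL2": ManagementClass(20, 0, "BC1"),
    "SPECIAL3": ManagementClass(5, 60, "BC1"),
}


def differs_only_in_backup(a: str, b: str) -> bool:
    """True when two classes share migration parameters but not backup criteria."""
    ca, cb = MGMTCLAS[a], MGMTCLAS[b]
    return (ca.n, ca.m) == (cb.n, cb.m) and ca.backup != cb.backup


print(differs_only_in_backup("STANDARD", "SPECIAL1"))  # True
print(differs_only_in_backup("STANDARD", "SPECIAL3"))  # False
```

Keeping migration and backup parameters in the same record is what forces a new row for every distinction in either policy.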
There are a number of problems associated with parameter-based schemes like the MGMTCLAS scheme sketched above.
Difficulties in choosing parameter values for "n" and "m".
One aspect of the dilemma is that if one sets the minimum migration age too low, that "aggressive" policy will cause too many files to be eligible for migration which can lead to "thrashing"--the unproductive movement of data files down and up in the storage hierarchy. Thrashing is an inefficient use of system resources and contributes to application delay and end-user frustration. The other aspect of this dilemma is that if one sets the minimum migration age too high, that "conservative" policy will waste space on primary DASD by allowing inactive files to reside there too long. That leads to exposure to free-space shortages and other problems for both the storage administrators and end-users. Ultimately, it can lead to the acquisition of more DASD hardware to relieve the constraints caused by such waste.
Difficulties in assigning MGMTCLAS rules to data sets.
Item 1 above describes the "definition" side of the problem. This item describes the "assignment" side of the same problem. That is, given that some arbitrary value(s) have been defined for MGMTCLAS, which data sets should be assigned to management class x, which should be assigned to management class y, etc.? In other words, what scheme is used to take the total population of data sets (i.e. files) and assign each of them an appropriate "policy" in the form of a MGMTCLAS rule?
In IBM's implementation of this aspect of storage management, a component called the "ACS routine" makes that assignment. Typically, such code is not very ambitious for a variety of reasons. It tends to start with an assignment of all data sets to some "standard" class and then deviate with assignments to "special" classes on an exception basis as needed over time. The main distinction is some identification based on the name of the data set.
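The flavor of such name-based assignment can be sketched in Python. This is not ACS routine syntax, and the patterns and data set names below are invented for illustration; the sketch only shows the "default to a standard class, then carve out exceptions by name" shape described above.

```python
from fnmatch import fnmatchcase

# Hypothetical exception rules: (data set name pattern, management class).
# Anything that matches no pattern falls through to the standard class.
EXCEPTION_RULES = [
    ("PAYROLL.*", "SPECIAL1"),
    ("*.TEMP.*", "SPECIAL3"),
]


def assign_class(dsname: str) -> str:
    """Assign a management class from the data set name alone."""
    for pattern, mgmtclas in EXCEPTION_RULES:
        if fnmatchcase(dsname, pattern):
            return mgmtclas
    return "STANDARD"


print(assign_class("PAYROLL.MASTER.DATA"))  # SPECIAL1
print(assign_class("USER1.TEMP.LIST"))      # SPECIAL3
print(assign_class("USER1.REPORT.DATA"))    # STANDARD
```

Because the name is the only input, two data sets with very different reference patterns receive the same policy whenever their names happen to match the same rule.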
Failure to Consider File Size Properly.
Once a data set (file) is eligible to be migrated based on its age, it may be migrated no matter how small it is. Very small data sets do little to alleviate space occupancy conditions on a primary volume because only a small amount of space is being freed up. Yet if relatively small data sets find their way out to the ML2 layer of the storage hierarchy, they are exposed to the risk of needing to be recalled. The manual steps of locating a tape cartridge to service a demand recall for a small data set combined with the manual steps of refiling that tape in the tape library make this something that is simply not worth the risk. The data transfer time is negligible once the data is ready to be read (i.e. the tape mount is satisfied) and yet the application delay time and/or user frustration caused by such a wait for recall is significant. It is therefore simply not worth it to expose relatively small data sets to the risk of being recalled. IBM algorithms only consider space within a set of data sets with the same age. To view the inefficiency in such a scheme, consider the following simple table of examples which illustrate how small data sets are exposed to the risk of needing to be recalled (assumes AGE must be 15 or greater to be eligible):
______________________________________
Age        Size           Order
______________________________________
18 days    1 track        1st
17 days    500 tracks     2nd
16 days    1000 tracks    3rd
16 days    12 tracks      4th
______________________________________
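The ordering in the table can be reproduced with a short sketch: eligible data sets are taken oldest first, and size is consulted only to break ties among data sets of the same age. This is an illustration of the behavior described in the text, not IBM's actual algorithm; the function name `migration_order` and the tuple layout are assumptions.

```python
from typing import List, Tuple

# Each candidate is (age in days, size in tracks), as in the example table.
Candidate = Tuple[int, int]


def migration_order(candidates: List[Candidate], min_age: int = 15) -> List[Candidate]:
    """Order eligible data sets oldest first; larger first only within equal age."""
    eligible = [c for c in candidates if c[0] >= min_age]
    return sorted(eligible, key=lambda c: (-c[0], -c[1]))


data_sets = [(16, 12), (17, 500), (18, 1), (16, 1000)]
for age, tracks in migration_order(data_sets):
    print(f"{age} days, {tracks} tracks")
```

The one-track data set is migrated first even though it frees almost no space, while the 1000-track data set waits until third.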
Proliferation of Complexities Due to Backup Criteria.
Because MGMTCLAS also contains all the criteria for backup policy (i.e. how often to backup changed files, how many backup versions to keep, etc.), there is a tendency for changes in migration policy and changes in backup policy to complicate each other. That is, to create a distinction in migration policy requires a new "row" in the MGMTCLAS table; to create a distinction in backup policy also requires a new row. Complex distinctions can require many combinations.
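The growth in rows can be illustrated with a small calculation. The sketch assumes the worst case, in which every migration policy must be available with every backup policy; the specific policy values are reused from the example table and the variable names are hypothetical.

```python
from itertools import product

# Distinct migration policies (n, m) and backup policies from the example table.
migration_policies = [(20, 40), (20, 0), (5, 60)]
backup_policies = ["BC1", "BC2"]

# Worst case: one MGMTCLAS row per combination of the two kinds of policy.
rows = list(product(migration_policies, backup_policies))
print(len(rows))  # 6

# Adding a single new backup policy adds one row per migration policy.
rows_after = list(product(migration_policies, backup_policies + ["BC3"]))
print(len(rows_after) - len(rows))  # 3
```

In practice not every combination is needed, but the row count is bounded only by the product of the two sets of distinctions.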
Impact of Organization Changes.
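It has been shown above that there are certain inherent difficulties in coming up with anything but a simplistic and arbitrary scheme for migration policy. The problems are magnified when one considers the dynamic aspect of the storage management domain. That is, most organizations with DASD farms to manage encounter several or most of the following events:
new applications and users are added to the system;
existing applications change;
continuous technology changes need evaluation;
staffing changes within the organization; and
organizations merge with other entities.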
Thus, even if one allows that a MGMTCLAS table of policies has been defined and an ACS routine written to assign policies in a manner that is acceptable in the short run, the dynamics of change will work toward undermining these schemes.
The foregoing difficulties illustrate the challenge of managing DASD resources. Keeping track of all those intersecting rules and the corresponding assignments can be very complicated in a large, dynamic DASD installation. Because of the complexities, there is a tendency to resist making distinctions in policies in order to keep things simple to manage. Thus, there is a need for a product which can manage hundreds of thousands of heterogeneous files from many different applications in an intelligent, cost-effective, user-friendly manner.