1. Technical Field
The present invention relates generally to electronic digital file storage management systems and, more particularly, to a policy decision stash and corresponding methods for storage lifecycle management.
2. Description of the Related Art
Use of electronic data storage is increasing at an exponential rate. Much of this data is stored in ordinary POSIX-like file systems. Further, much of this data is write-once and is to be retained for long periods of time.
The most commonly used disk storage devices are cheap, but they are not free, and they are certainly neither perfectly reliable nor absolutely durable. Accordingly, there is a need to migrate data to cheaper and/or more reliable media, a need to back up data, and a need to make replicas.
The vast amounts of data and numbers of files maintained make the manual management of the lifecycle (e.g., creation, backup, archiving, migration, replication, deletion, and so forth) of data files burdensome, error-prone, and impractical. Also, government regulations and business requirements demand that data management be conducted according to policy rules that conform to laws, practices, and so forth.
Even in a typical consumer home, there will be tens of thousands of files. For example, consider the operating system(s) and application program files, as well as financial documents and digital media such as photos (e.g., JPEGs), music (e.g., MP3s), and movies (e.g., MPEGs). In an enterprise with thousands of employees, customer databases, and so forth, there can be hundreds of millions of files to be managed.
Taken together, the multitude of legal and business requirements and the vast number of file objects to be managed necessitate the automated application of data management policy rules.
Currently, almost every implementation of a data management system for files operates by reading all of the catalog and/or directory entries for all of the files, from first to last, each time a management job is initiated. Each management job, be it backup, migration, deletion, and so forth, typically compares file pathnames and file attributes against a set of policy rules to determine which files should be acted upon.
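The full-scan approach described above can be illustrated with a minimal sketch. It is not taken from any particular product. The names PolicyRule and full_scan, the rule fields (a file name pattern and a minimum age), and the first-match-wins behavior are assumptions made only for illustration.

```python
import fnmatch
import os
import time
from dataclasses import dataclass


@dataclass
class PolicyRule:
    """Hypothetical rule: select files whose name matches a glob pattern
    and whose last modification is at least min_age_days old."""
    action: str          # e.g. "backup", "migrate", "delete"
    name_pattern: str    # glob matched against the file name
    min_age_days: float  # minimum days since last modification


def full_scan(root, rules, now=None):
    """Visit every directory entry under root and return (path, action) pairs.

    Nothing is remembered between calls, so every management job pays
    the full metadata-scan cost again.
    """
    now = time.time() if now is None else now
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)  # read the file attributes
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - st.st_mtime) / 86400.0
            for rule in rules:
                if (fnmatch.fnmatch(name, rule.name_pattern)
                        and age_days >= rule.min_age_days):
                    selected.append((path, rule.action))
                    break  # first matching rule wins (an assumption)
    return selected
```

In this sketch, a call such as full_scan("/data", [PolicyRule("migrate", "*.log", 30)]) touches the attributes of every file under the root even when only a few files qualify, which is the overhead discussed in the following paragraph.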
The overhead of searching and reading the file catalogs and directories (that is, scanning the metadata of a file system) while performing policy-rule-driven maintenance operations such as backup and data migration expends a significant number of cycles, so much so that it is becoming a significant problem or expense in the operation of these systems, as exemplified by Tivoli Storage Manager (TSM) (data backup) and Tivoli Storage Manager for Space Management (HSM) (data migration, which is also known as hierarchical storage management).
These data management systems also typically lack any predictive or forecasting capability that could help with both storage planning and understanding the implications of policy rules.
In one example of a prior art policy-driven storage management system, namely IBM's mainframe DFSMS/HSM, as described by Jimenez et al. in “DFSMShsm Primer”, ISBN-0738421057, IBM Document No. SG24-5272-01, which is incorporated by reference herein, the system periodically re-evaluates policy rules on all objects, executing management actions as required, but without caching decisions, without predicting decisions for future actions, and without facilitating forecasting and planning.