The invention relates generally to hierarchical data storage systems and more particularly to data storage management in such systems.
Data storage systems are available for use by host processors to store the increasingly large amounts of data that are being generated, accessed, and/or analyzed by applications running on those host processors. Today the capacity of such data storage systems is measured in terabytes.
A typical data storage system might include three levels of data storage, namely, a cache memory, an array of disks, and a tertiary storage device, such as a tape drive or a farm of tape drives, that can be connected to the system through an appropriate interface. The cache memory, which may be implemented by high speed RAM (Random Access Memory), provides storage for data that is being accessed by the applications running on the host processors. It is the working memory. The array of disks, which provides much larger storage capacity than the cache memory and might include hundreds of disk devices, provides the more permanent storage for the data. The disks are not practical for use as the working memory because they are much slower than the cache memory. Data is staged from the array of disks (i.e., the slower storage) to cache memory (i.e., the faster storage) when it is needed by the host processors, and it is destaged back from cache memory to the array of disks when it is no longer needed.
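The staging and destaging just described can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not a description of any particular system: the class name `TwoLevelStore`, its methods, the fixed `cache_capacity`, and the choice to destage the oldest resident block when the cache is full are all hypothetical and are introduced here only for illustration.

```python
class TwoLevelStore:
    """Toy model of a cache memory in front of a disk array (illustrative only)."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity  # assumed fixed number of blocks
        self.cache = {}  # block id -> data held in the faster cache memory
        self.disk = {}   # block id -> data held on the slower disk array

    def destage(self, block_id):
        """Move a block from cache memory back to the disk array."""
        self.disk[block_id] = self.cache.pop(block_id)

    def _make_room(self):
        # Assumption: when the cache is full, destage the block that has
        # been resident longest (dicts preserve insertion order).
        if len(self.cache) >= self.cache_capacity:
            self.destage(next(iter(self.cache)))

    def stage(self, block_id):
        """Copy a block from the disk array into cache memory and return it."""
        if block_id in self.cache:
            return self.cache[block_id]
        data = self.disk[block_id]  # raises KeyError for an unknown block
        self._make_room()
        self.cache[block_id] = data
        return data

    def write(self, block_id, data):
        """Host write: the data lands in cache memory (the working memory)."""
        if block_id not in self.cache:
            self._make_room()
        self.cache[block_id] = data


store = TwoLevelStore(cache_capacity=2)
store.write("a", b"alpha")
store.write("b", b"beta")
store.write("c", b"gamma")   # cache is full, so "a" is destaged to disk
data = store.stage("a")      # "a" is staged back in, and "b" is destaged
```

In this sketch a staged block remains on the disk array as well, which reflects the role of the disks as the more permanent storage.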
The tertiary storage provides the most permanent storage for the data. Since the tape drives that are often used for the tertiary storage are much slower than the disk devices, the tertiary storage is only used for data that is accessed very infrequently.
Known techniques are available for moving data from tertiary storage to cache storage and for determining what data should be moved to tertiary storage. Typically, the decision on what data should be destaged is based on some measure of access frequency. In general, systems that include such multiple levels of storage are referred to as hierarchical data storage systems, and the techniques for managing the data flow between the levels are referred to generally as hierarchical storage management.
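As a hedged illustration of a destage decision based on access frequency, the sketch below counts accesses in a log and selects the blocks that fall below a threshold. The text above says only that "some measure of access frequency" is used; the function name `select_for_tertiary`, the raw access count as the measure, and the `threshold` parameter are assumptions made for this example.

```python
from collections import Counter


def select_for_tertiary(access_log, resident_blocks, threshold):
    """Return the resident blocks accessed fewer than `threshold` times.

    access_log      -- iterable of block ids, one entry per access (assumed input)
    resident_blocks -- block ids currently held on the disk array
    threshold       -- assumed cutoff; blocks below it are destage candidates
    """
    counts = Counter(access_log)  # a missing block id counts as 0 accesses
    return sorted(b for b in resident_blocks if counts[b] < threshold)


log = ["a", "a", "b", "a", "c", "c"]
candidates = select_for_tertiary(log, {"a", "b", "c", "d"}, threshold=2)
# "b" was accessed once and "d" never, so both are candidates for tertiary storage
```

A practical policy would likely also weigh how recently each block was accessed; the raw count is used here only to keep the example short.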