Networked data storage is commonly used in enterprise environments to make data available to multiple users and to automatically maintain copies of data on different storage devices, which may be in different geographical locations, in order to reduce the likelihood of data loss. Generally, data IO requests are sent from a user computer or network device to a primary storage array (R1) via a network. In order to mirror the production data stored by the primary storage array, the primary storage array is configured in a partner relationship with a remote secondary storage array (R2).
An individual storage array may be associated with multiple tiers of data storage resources having different performance characteristics, e.g., storage capacity and read/write speed. The cost per bit of stored data can vary widely for different storage resources. For example, a high-speed flash array is more costly per bit of storage than an array of disk drives, which in turn is more expensive than an array of optical disks. Performance of the overall storage array is at least in part a function of how effectively the different storage resources are utilized.
The storage array may also move data into and out of cache memory to enhance performance. Storage system cache is implemented as high-speed memory, the cost per GB of which is much greater than the cost per GB of storage in a persistent tier. A read request for data that is in the storage system cache can be satisfied by the storage system more efficiently and with a lower response time than if the storage system had to access the data in persistent storage. Consequently, deciding which data to place in cache affects performance of both tiered and non-tiered storage.
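The effect of cache placement on read requests can be illustrated with a minimal sketch. The class name ReadCache, the dictionary standing in for persistent storage, and the least-recently-used eviction rule are all assumptions made for illustration; the text above does not specify any particular placement or eviction policy.

```python
from collections import OrderedDict


class ReadCache:
    """Hypothetical LRU read cache in front of a slower persistent store."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # dict standing in for persistent storage
        self.capacity = capacity            # maximum number of cached blocks
        self.entries = OrderedDict()        # ordered oldest to most recently used
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.entries:
            # Cache hit: served from fast memory, no persistent-storage access.
            self.entries.move_to_end(block_id)
            self.hits += 1
            return self.entries[block_id]
        # Cache miss: fetch from the slower persistent store, then cache it.
        self.misses += 1
        data = self.backing_store[block_id]
        self.entries[block_id] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used block (assumed policy).
            self.entries.popitem(last=False)
        return data
```

In this sketch, every miss stands for an access to persistent storage, so the ratio of hits to misses under a given workload is a rough proxy for how well the placement decision serves that workload.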
Hierarchical storage management systems automatically move data between different storage tiers in order to effectively utilize the different storage resources. Most of the enterprise's data is typically stored on relatively slower storage devices. Data is moved from the slower storage devices to faster storage devices based on activity. For example, data files which are frequently accessed are stored on relatively faster devices, but may be moved to relatively slower devices if the files are not accessed for a predetermined period of time. When a file is accessed again, it may be moved back to relatively faster storage. By moving files based on activity and utilizing less costly, slower storage devices for the relatively rarely accessed files, the storage array achieves, at a lower cost, performance which may approach that of using a greater amount of more costly storage resources.
Data management processing plans are not limited to movement of data between tiers in tiered storage. Data storage systems are also required to make decisions associated with movement of data across storage system boundaries in order to balance the workload on a set of storage systems, to maintain network proximity to key users of the data, and for other reasons. The logic for making these decisions is typically implemented in software or firmware on enterprise equipment. Consequently, modifying the data management processing plan logic generally involves significant effort and can create problems if not carefully executed.
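The activity-based movement of files between tiers described above can be sketched as follows. This is an illustrative sketch only: the names TierManager and FileRecord, the two-tier model, and the parameters idle_limit and promote_after are hypothetical and are not drawn from any particular product.

```python
from dataclasses import dataclass

FAST, SLOW = "fast", "slow"


@dataclass
class FileRecord:
    name: str
    tier: str = SLOW          # most data starts on the slower tier
    last_access: float = 0.0  # time of the most recent access
    access_count: int = 0     # total accesses observed so far


class TierManager:
    """Hypothetical two-tier manager that moves files based on activity."""

    def __init__(self, idle_limit, promote_after):
        self.idle_limit = idle_limit        # the "predetermined period of time"
        self.promote_after = promote_after  # accesses needed to count as active
        self.files = {}

    def add(self, name, now):
        self.files[name] = FileRecord(name, last_access=now)

    def access(self, name, now):
        rec = self.files[name]
        rec.last_access = now
        rec.access_count += 1
        # Promote a file on the slow tier once it has been accessed often enough.
        if rec.tier == SLOW and rec.access_count >= self.promote_after:
            rec.tier = FAST
        return rec.tier

    def demote_idle(self, now):
        # Move files that have been idle for at least idle_limit to the slow tier.
        demoted = []
        for rec in self.files.values():
            if rec.tier == FAST and now - rec.last_access >= self.idle_limit:
                rec.tier = SLOW
                demoted.append(rec.name)
        return demoted
```

Because the access count is not reset on demotion in this sketch, a demoted file returns to the fast tier on its next access, mirroring the statement that a file may be moved back when it is accessed again. Even in this small form, the thresholds and movement rules are fixed in code, which suggests why changing such logic on deployed equipment takes effort.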