In a large data processing system having one or more disk storage subsystems, the workload of the disk storage subsystems usually fluctuates over time. Typically, the load, expressed as input/output operations (I/Os) per second (IOPs), shows small short-term variations that are “noise-like” and thus hard to predict. Over the longer term, there may be diurnal (daily) and hebdomadal (weekly) variations that are more predictable. At peak workload times, the IOPs demand on some storage devices, or on some logical units within them, may be so high as to lead to long latencies for I/O operations.
One existing solution to this problem is to size the system to have a large enough capacity to cope with the peaks. However, the durations of these peak workloads may be insufficient to financially justify sizing the system capacity for them.
Another possible course of action is to detect the occurrence of a peak workload and then take corrective action in response. It may be possible, for example, to spread the placement of the heavily used data across a larger number of physical devices in order to exploit their combined I/O capability. It is well known to those of ordinary skill in the art that striping a heavily used dataset across multiple real devices allows a larger workload to be sustained, as is known, for instance, from Redundant Arrays of Independent Disks (RAID) 0 storage arrangements.
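By way of illustration only, the striping idea may be sketched as a mapping from a logical block number to a physical device and an offset on that device. The function name `locate_block`, the type `StripeLocation`, and the parameters `num_devices` and `stripe_blocks` are hypothetical and are not taken from any particular RAID implementation; the sketch assumes simple round-robin placement with no parity or redundancy.

```python
from typing import NamedTuple


class StripeLocation(NamedTuple):
    device: int  # index of the physical device holding the block
    offset: int  # block offset within that device


def locate_block(logical_block: int, num_devices: int,
                 stripe_blocks: int = 1) -> StripeLocation:
    """Map a logical block to (device, offset) under round-robin striping.

    Illustrative assumption: stripe units of `stripe_blocks` blocks are
    dealt out to the devices in turn, as in a RAID 0 style layout.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    if num_devices < 1 or stripe_blocks < 1:
        raise ValueError("num_devices and stripe_blocks must be at least 1")

    # Which stripe unit the block falls in, and its position inside that unit.
    stripe_index, within = divmod(logical_block, stripe_blocks)
    # Stripe units are assigned to devices in rotation; `row` counts how many
    # full rotations precede this unit.
    row, device = divmod(stripe_index, num_devices)
    return StripeLocation(device, row * stripe_blocks + within)


if __name__ == "__main__":
    # Eight consecutive logical blocks spread over four devices.
    for block in range(8):
        print(block, locate_block(block, num_devices=4))
```

Under these assumptions, consecutive logical blocks land on different devices, so a run of accesses to a heavily used dataset is shared among all of the devices rather than concentrated on one.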
In a data processing system that has a variety of storage devices, some with higher performance than others, yet another course of action is to move, or “migrate,” heavily accessed data onto the faster devices when the demand arises. This is sometimes termed “Adaptive Data Placement.”
The difficulty with these latter approaches is that copying or moving data itself generates additional I/O operations, and so increases the IOPs demand at the worst possible time, when demand is already too high. It would thus be more desirable to provide adaptive data placement without increasing IOPs demand at busy times. For this and other reasons, there is a need for the present invention.