Arrayed storage devices, such as RAID (redundant array of independent disks) disk arrays, are data storage devices intended to provide better performance and reliability than single-media storage devices, such as individual hard disks. Their performance advantage over single storage devices comes from the ability to service read or write requests in parallel across numerous disks, rather than having to service the same requests serially on a single disk. On average, a RAID device can therefore service more input/output operations (I/Os) in a given amount of time than a single disk can.
However, the performance advantage achievable in an arrayed storage device over a single storage device depends directly on how evenly I/Os are distributed across the disks in the arrayed device. Under circumstances in which numerous host computer I/O requests are all directed at data stored on disk #1 of a 20-disk array, for example, the 20-disk array provides little or no advantage over a single storage device for those requests. The heavily accessed data creates a bottleneck at disk #1, and any benefit to the host computer in using the arrayed storage device over a single storage device is significantly reduced with respect to the heavily accessed data.
Data striping is a technique used in RAID devices to distribute data and I/Os evenly across the array of disk drives in order to maximize the number of simultaneous I/O operations that the array can perform. Data striping concatenates multiple disk drives into one logical storage unit and partitions each drive's storage space into stripes that can be as small as one sector (512 bytes) or as large as several megabytes. The stripes are interleaved in a round-robin fashion so that the combined space is composed alternately of stripes from each drive. The type of application environment determines whether large or small data stripes are more beneficial. In an I/O-intensive environment, performance is optimized when stripes are large enough that a record can fall within a single stripe, so that each request is serviced by one drive and the remaining drives stay free to service other requests. In data-intensive environments, smaller stripes (typically one 512-byte sector in length) are better because a long record is then spread across several drives that can transfer its parts in parallel, which permits faster access to longer records.
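The round-robin interleaving described above can be illustrated with a minimal sketch. It assumes a simple striped layout with no parity or mirroring, and the names `locate` and `disks_touched` are hypothetical helpers introduced only for illustration; they do not represent the layout of any particular RAID device.

```python
def locate(logical_sector, num_disks, stripe_sectors):
    """Map a logical sector to (disk index, sector offset on that disk).

    Assumes plain round-robin striping: stripe 0 on disk 0, stripe 1 on
    disk 1, and so on, wrapping back to disk 0 after the last disk.
    """
    stripe_index = logical_sector // stripe_sectors   # which stripe overall
    disk = stripe_index % num_disks                   # round-robin disk choice
    stripe_on_disk = stripe_index // num_disks        # stripes already on that disk
    offset = stripe_on_disk * stripe_sectors + logical_sector % stripe_sectors
    return disk, offset


def disks_touched(start_sector, length, num_disks, stripe_sectors):
    """Return the set of disks that hold any sector of one record."""
    return {
        locate(sector, num_disks, stripe_sectors)[0]
        for sector in range(start_sector, start_sector + length)
    }
```

Under these assumptions, `locate(13, 4, 8)` returns `(1, 5)`. A 16-sector record starting at sector 0 of a 4-disk array touches all four disks when the stripe is one sector long, but only disk 0 when the stripe is 64 sectors long, which mirrors the trade-off between small and large stripes.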
Although data striping generally provides more parallel access to data stored on an arrayed storage device, it does not solve the problem of bottlenecking that can occur at a single disk drive when particular data on that drive is heavily accessed. Data striping places data without regard to whether that data is, or will become, heavily accessed. Furthermore, once the data is “striped”, it remains stored in the same location on the same disk. Therefore, if a host computer bombards a particular disk drive in an array with I/O requests pertaining to certain data, a bottleneck will occur at that disk drive even though data striping was used to store the data initially.
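The bottleneck described above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the same plain round-robin layout as before, single-sector requests, and hypothetical helper names (`per_disk_load`, `busiest_share`) and workload values chosen purely for illustration.

```python
from collections import Counter


def per_disk_load(request_sectors, num_disks, stripe_sectors):
    """Count how many requests land on each disk of a striped array."""
    load = Counter({disk: 0 for disk in range(num_disks)})
    for sector in request_sectors:
        # Round-robin striping: the stripe index modulo the disk count
        # selects the disk that holds the requested sector.
        load[(sector // stripe_sectors) % num_disks] += 1
    return load


def busiest_share(load):
    """Fraction of all requests handled by the most heavily loaded disk."""
    total = sum(load.values())
    return max(load.values()) / total if total else 0.0


NUM_DISKS, STRIPE_SECTORS = 20, 8   # illustrative values only

# Evenly spread workload: one request to the first sector of each of 100 stripes.
uniform = [stripe * STRIPE_SECTORS for stripe in range(100)]

# Skewed workload: 100 requests that all target the same sector.
hot = [3] * 100

uniform_share = busiest_share(per_disk_load(uniform, NUM_DISKS, STRIPE_SECTORS))
hot_share = busiest_share(per_disk_load(hot, NUM_DISKS, STRIPE_SECTORS))
```

With the evenly spread workload, each of the 20 disks receives 5 of the 100 requests, so the busiest disk handles 5% of them. With the skewed workload, a single disk handles every request, and the other 19 disks sit idle even though the data was striped.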
Accordingly, a need exists for a way to determine whether data stored in an arrayed storage device is likely to be heavily accessed, and to distribute such data across the storage components within the array so that the workload is more evenly distributed and I/O operations occur in a more parallel manner.