A storage system is a processing system adapted to store and retrieve data on behalf of one or more client processing systems (“clients”) in response to external input/output (I/O) requests received from clients. A storage system can provide clients with file-level access and/or block-level access to data stored in a set of mass storage devices, such as magnetic or optical storage disks or tapes. The storage devices can be organized as one or more Redundant Array of Independent (or Inexpensive) Disks (RAID) groups.
Storage arrays may exhibit load imbalances from time to time in that some storage devices in a RAID array are accessed more frequently than others. As system load increases, those storage devices saturate, limiting the overall system's performance while other storage devices are underutilized. Load imbalance may arise from a variety of sources. For example, data blocks created at particular times may be accessed more frequently than those created at other times. Thus, growth of an aggregate over time can create load imbalances. As used herein, an “aggregate” is a logical container for a pool of storage which combines one or more physical mass storage devices (e.g., disks) or parts thereof into a single logical storage object, which contains or provides storage for one or more other logical data sets at a higher level of abstraction (e.g., volumes). A “data block”, as the term is used herein, is a contiguous set of data of a known length starting at a particular offset value.
Furthermore, storage devices with different response times or different ratios of access rate to capacity can create load imbalances. Storage devices with a larger capacity typically have more data stored on them than storage devices with a smaller capacity, and thus tend to be accessed more frequently. For example, Advanced Technology Attachment (ATA) disks often hold more data and operate more slowly than other disks (e.g., Fibre Channel (FC) disks).
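The effect of the access-rate-to-capacity ratio can be illustrated with a minimal sketch. It assumes that requests are spread uniformly over stored data and that every device is equally full, so that each device receives a share of requests proportional to its capacity. The `Device` class, the `utilization` function, and all capacity and request-rate figures are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    capacity_gb: float  # hypothetical capacity
    max_iops: float     # hypothetical maximum sustainable request rate


def utilization(devices, total_iops):
    """Return per-device utilization (1.0 means saturated).

    Assumption: requests are spread uniformly over stored data and all
    devices are equally full, so request share is proportional to capacity.
    """
    total_capacity = sum(d.capacity_gb for d in devices)
    result = {}
    for d in devices:
        share = d.capacity_gb / total_capacity  # fraction of requests hitting d
        result[d.name] = share * total_iops / d.max_iops
    return result


# Two smaller, faster disks and one larger, slower disk (illustrative numbers).
devices = [
    Device("fc0", capacity_gb=300, max_iops=200),
    Device("fc1", capacity_gb=300, max_iops=200),
    Device("ata0", capacity_gb=1000, max_iops=100),
]
util = utilization(devices, total_iops=150)
busiest = max(util, key=util.get)
```

Under these assumed numbers, the larger and slower device approaches saturation while the two smaller devices remain lightly loaded, which is the kind of imbalance described above.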
Some existing solutions attempt to resolve load imbalance by providing hierarchical storage in which storage devices are arranged according to their inherent characteristics, such as their access time. The access time refers to the amount of time it takes to locate and provide data from the media in response to a request. Access time reflects the location of the media (e.g., local or remote) as well as the time it takes to find the correct position of data on the individual media. Typically, tape drives have longer access times than disk drives and are more often used for archival storage located remotely from primary storage of disk drives. One such exemplary hierarchical system is described in U.S. Pat. No. 6,032,224 (referred to herein as the '224 patent), assigned to EMC Corporation of Hopkinton, Mass. In such a system, storage devices are arranged based on their inherent access time so that the storage devices that have a low access time (such as disk drives) are arranged at the top of the hierarchy and storage devices that have a high access time (such as tape drives) are arranged at the bottom of the hierarchy. Among the storage devices described in the '224 patent that are arranged hierarchically are a solid state disk, a local disk, a network disk, an optical disk, a network optical disk, a local tape, and a network tape. With such a hierarchical arrangement, the '224 patent describes a system that monitors the rate of access (i.e., the number of accesses or usage per unit of time) of individual blocks of storage, moves more frequently used blocks to storage devices nearer the top of the hierarchy (such as disk drives) having inherently low access time, and moves less often accessed blocks to storage devices nearer the bottom of the hierarchy (such as tape drives) having inherently high access time. Accordingly, hierarchical storage systems are generally directed towards providing overall system response improvements between primary, secondary, and lower levels of storage.
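The general idea of rate-based hierarchical placement can be sketched as follows. This is an illustrative simplification and not the implementation of the '224 patent: the function name `place_blocks`, the tier names, the tier capacities, and the access trace are all hypothetical, and the sketch assumes a fixed observation window and that the lowest tier can hold every remaining block.

```python
from collections import Counter


def place_blocks(access_log, window_seconds, tiers):
    """Assign blocks to tiers by observed access rate.

    access_log: iterable of block ids, one entry per access in the window.
    tiers: list of (tier_name, capacity_in_blocks), fastest tier first.
           Assumption: the last tier absorbs all remaining blocks.
    Returns a dict mapping block id to tier name.
    """
    counts = Counter(access_log)
    # Rate of access: number of accesses per unit of time.
    rates = {block: n / window_seconds for block, n in counts.items()}
    # Most frequently accessed blocks first; ties broken by block id.
    ranked = sorted(rates, key=lambda block: (-rates[block], block))

    placement = {}
    start = 0
    for idx, (tier_name, capacity) in enumerate(tiers):
        is_last = idx == len(tiers) - 1
        chunk = ranked[start:] if is_last else ranked[start:start + capacity]
        for block in chunk:
            placement[block] = tier_name
        start += len(chunk)
    return placement


# Hypothetical trace: block "a" accessed 3 times, "b" twice, "c" and "d" once.
access_log = ["a", "b", "a", "c", "a", "b", "d"]
placement = place_blocks(access_log, window_seconds=60,
                         tiers=[("disk", 2), ("tape", 2)])
```

In this toy trace, blocks "a" and "b" are placed on the hypothetical disk tier and blocks "c" and "d" on the tape tier, so 5 of the 7 recorded accesses fall on blocks held by the top tier.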
Hierarchical storage systems attempt to serve as many requests as possible from the fastest level of the hierarchy. However, segregating storage devices into “fast” and “slow” itself creates load imbalances. In such a hierarchy, requests are served from the top of the hierarchy (fast storage devices) and the bottom of the hierarchy feeds data to the top. This explicit concentration of requests into a subset of the total collection of storage devices creates load imbalance.
Accordingly, what is needed is a mechanism that reduces or eliminates load imbalance in the collection of storage devices without limitations imposed by prior art systems.