A traditional storage array (herein also referred to as a “disk storage array”, “disk array”, or simply “array”) is a collection of hard disk drives operating together logically as a unified storage device. Storage arrays are designed to store large quantities of data. Storage arrays typically include one or more storage array processors (SPs) for processing input/output (I/O) requests and management-type requests. Data storage resource allocation requests are generally generated internally (i.e., are not received from outside the data storage array). An SP is the controller for, and primary interface to, the storage array.
Storage systems may include one or more disk arrays. Disk arrays may use a variety of storage devices with various characteristics for providing storage. Each storage array may logically operate as a unified storage device. While such organization generally allows for a homogeneous view of the storage devices, it is sometimes useful to organize the various storage devices into tiers or classes of storage. A tier is generally delineated by differences in at least one of the following four attributes: price, performance, capacity and function. For example, tier 1 storage devices may be comprised of storage media that are very reliable and very fast, such as flash memory. Tier 2 storage devices may be comprised of storage media that are slower than tier 1 media but are very reliable (e.g., a hard disk). For example, tier 2 storage devices may include high performance disks such as 15,000 RPM Fibre Channel (FC) disks. Tier 3 storage devices may be comprised of comparatively slower and cheaper storage media than either tier 1 or tier 2, such as 7,200 RPM serial ATA (SATA) disks.
Performance of a storage array may be characterized by the array's total capacity, response time, and throughput. The capacity of a storage array is the maximum total amount of data that can be stored on the array. The response time of an array is the amount of time that it takes to read data from or write data to the array. The throughput of an array is a measure of the amount of data that can be transferred into or out of (i.e., written to or read from) the array over a given period of time.
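The response time and throughput definitions above can be made concrete with a minimal sketch. The `IoSample` record and both function names are hypothetical and introduced only for illustration; the sketch assumes each completed I/O is logged with a start time, an end time, and a byte count.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IoSample:
    """One completed I/O request (hypothetical record layout)."""
    start_s: float    # time the request was issued, in seconds
    end_s: float      # time the request completed, in seconds
    bytes_moved: int  # bytes read from or written to the array


def mean_response_time_s(samples: List[IoSample]) -> float:
    """Average time taken to service one request."""
    return sum(s.end_s - s.start_s for s in samples) / len(samples)


def throughput_bytes_per_s(samples: List[IoSample]) -> float:
    """Total bytes transferred divided by the observation window."""
    window = max(s.end_s for s in samples) - min(s.start_s for s in samples)
    return sum(s.bytes_moved for s in samples) / window
```

The two measures are independent of each other: an array can service each request quickly yet move little data overall, or move a great deal of data while individual requests wait a long time.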
The administrator of a storage array may desire to operate the array in a manner that maximizes throughput and minimizes response time. In general, performance of a storage array may be constrained by both physical and temporal constraints. Examples of physical constraints include bus occupancy and availability, excessive disk arm movement, and uneven distribution of load across disks. Examples of temporal constraints include bus bandwidth, bus speed, spindle rotational speed, serial versus parallel access to multiple read/write heads, and the size of data transfer buffers.
One factor that may limit the performance of a storage array is the performance of each individual storage component. For example, the read access time of a disk storage array is constrained by the access time of the disk drive from which the data is being read. Read access time may be affected by physical characteristics of the disk drive, such as the number of revolutions per minute of the spindle: the faster the spin, the less time it takes for the sector being read to come around to the read/write head. The placement of the data on the platter also affects access time, because it takes time for the arm to move to, detect, and properly orient itself over the proper track (or cylinder, for multihead/multiplatter drives). Reducing the read/write arm swing reduces the access time. Finally, the type of drive interface may have a significant impact on overall disk array performance. For example, a multihead drive that supports reads or writes on all heads in parallel will have a much greater throughput than a multihead drive that allows only one head at a time to read or write data.
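The effect of spindle speed on access time can be illustrated with a short calculation. The sketch below assumes the common approximation that, on average, the desired sector is half a revolution away from the read/write head; the function name is hypothetical, and seek time and transfer time are deliberately ignored.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2.0


# Compare the two spindle speeds used as tier examples earlier in the text.
for rpm in (7_200, 15_000):
    print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Under this approximation a 15,000 RPM disk waits about 2 ms on average for the sector to arrive, while a 7,200 RPM disk waits a little over 4 ms.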
Furthermore, even if a disk storage array uses the fastest disks available, the performance of the array may be unnecessarily limited if only one of those disks may be accessed at a time. In other words, performance of a storage array, whether it is an array of disks, tapes, flash drives, or other storage entities, may also be limited by system constraints, such as the number of data transfer buses available in the system and the density of traffic on each bus.
Thus, to maximize performance of a storage array, the operational load should be more or less evenly distributed across all physical resources, so that each physical resource may operate at its own maximum capacity. Using a disk storage array as an example, bandwidth, and thus performance, is maximized if “all spindles are being accessed at the same time.”
Performance of a storage array may also be characterized by the total power consumption of the array. The administrator of a storage array may prefer to operate the array in a manner that minimizes power consumption (“green” mode) rather than maximizes performance (“brown” mode). Operating a large storage array in green mode may not only reduce power consumption of the array itself and its associated costs but also may have indirect benefits associated with the reduction of heat being generated by the array. For example, storage arrays typically are housed in an environmentally-controlled room or site; operating an array in green mode may reduce the heat that the air conditioning system must remove, thus lowering the cost to run the site HVAC system. Furthermore, semiconductor devices age faster in hot environments than in cold environments; a storage device, whether it is a hard disk drive, flash drive, or other, will age faster if it is mounted in a rack such that it is surrounded by other heat-generating storage devices than if it is in the same rack but surrounded by cool (e.g., idle) storage devices. Thus, operating a storage array in green mode may increase the mean time between failures for the devices in the array.
Separate from but intimately related to performance maximization is the problem of underuse of scarce physical resources. Storage arrays are typically used to provide storage space for one or more computer file systems, databases, applications, and the like. For this and other reasons, it is common for storage arrays to be logically partitioned into chunks of storage space, called logical units, or LUs. This allows a unified storage array to appear as a collection of separate file systems, network drives, and/or volumes.
The problem of underuse arises when, for example, an amount of storage space is allocated to, but not used by, an operating system, program, process, or user. In this scenario, the scarce (and probably expensive) resource—disk storage space, for example—is unused by the entity that requested its allocation and thus unavailable for use by any other entity. In many cases, the unused space cannot be simply given back. For example, a database installation may require many terabytes of storage over the long term even though only a small fraction of that space may be needed when the database is first placed into operation. In short, it is often the case that the large storage space will be eventually needed, but it is not known exactly when the entire space will be needed. In the meantime, the space lies unused and unavailable for any other use as well.
Recognizing that more storage space may be provisioned for operating systems, programs, and users than they may actually use at first, the concept of a sparsely populated or “thin” logical unit (TLU) was developed. Unlike the more traditional “fat” or fully allocated logical unit (FLU), which is created by provisioning and allocating a certain amount of storage area, a TLU is provisioned at creation but is not allocated any physical storage until the storage is actually needed. For example, physical storage space may be allocated to the TLU upon receipt of an I/O write request from a requesting entity, referred to herein as a “host”. Upon receipt of the write request from the host, the SP may then determine whether there is enough space already allocated to the TLU to store the data being written, and if not, allocate to the TLU additional storage space.
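The allocate-on-write behavior of a TLU described above can be sketched as follows. This is a minimal illustration, not the method of any particular array: the class names `SlicePool` and `ThinLU`, the tiny `SLICE_BYTES` value, and the per-slice mapping table are all assumptions made for the example.

```python
SLICE_BYTES = 1024  # hypothetical slice size, kept tiny for illustration


class SlicePool:
    """Hypothetical pool of free physical slices shared by many TLUs."""

    def __init__(self, free_slices: int):
        self.free = list(range(free_slices))

    def allocate(self) -> int:
        if not self.free:
            raise RuntimeError("slice pool exhausted")
        return self.free.pop(0)


class ThinLU:
    """A TLU: capacity is provisioned up front, slices are allocated on write."""

    def __init__(self, provisioned_bytes: int, pool: SlicePool):
        self.provisioned_bytes = provisioned_bytes
        self.pool = pool
        self.slice_map = {}  # logical slice index -> physical slice id

    def write(self, offset: int, length: int) -> list:
        """Handle a host write; return the logical slices newly allocated."""
        if offset < 0 or length <= 0 or offset + length > self.provisioned_bytes:
            raise ValueError("write outside provisioned range")
        first = offset // SLICE_BYTES
        last = (offset + length - 1) // SLICE_BYTES
        newly_allocated = []
        for index in range(first, last + 1):
            # Allocate only when no space is yet backing this region.
            if index not in self.slice_map:
                self.slice_map[index] = self.pool.allocate()
                newly_allocated.append(index)
        return newly_allocated
```

In this sketch a freshly created `ThinLU` consumes no slices from the pool; a write to a region that is already backed allocates nothing, and a write that crosses into an unbacked region allocates only the missing slices.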
While thin logical units provide distinct advantages over fully allocated logical units (i.e., where the entire storage space requested is actually allocated and reserved for the exclusive use of the requesting entity), the manner in which data storage resources (e.g., slices) are allocated across physical disks can have an enormous impact on the performance of the storage array. A slice is a portion of a logical partition of data stored on a physical disk device.
A naïve approach to allocation of storage for sparsely populated logical units, i.e., one that does not take into consideration the underlying physical and temporal constraints of the storage array in general and of the FLU pool in particular, may fail to meet the goals of the policy, such as green or brown for example, chosen by the administrator of the storage array. For example, if the administrator desires to maximize performance—i.e., a brown policy—a storage processor using a naïve allocation method might allocate all of the slices from a single physical disk, in which case the performance of the entire array may be needlessly constrained by the single disk and thus fail to meet the performance goals of the brown policy.
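The difference between a naïve allocator and one that spreads load across disks can be sketched as follows. Both functions and the `free_by_disk` mapping (disk name to number of free slices) are hypothetical; the spreading strategy shown, which always picks the disk that has received the fewest slices so far, is only one simple way to approximate the goal of keeping all spindles busy.

```python
from collections import Counter
from typing import Dict, List


def allocate_naive(free_by_disk: Dict[str, int], count: int) -> List[str]:
    """Drain the first disk that has free slices before touching the next."""
    free = dict(free_by_disk)
    chosen = []
    for disk in free:
        while free[disk] > 0 and len(chosen) < count:
            free[disk] -= 1
            chosen.append(disk)
    if len(chosen) < count:
        raise RuntimeError("not enough free slices")
    return chosen


def allocate_spread(free_by_disk: Dict[str, int], count: int) -> List[str]:
    """Give each new slice to the disk that has received the fewest so far."""
    free = dict(free_by_disk)
    used = Counter()
    chosen = []
    for _ in range(count):
        candidates = [d for d in free if free[d] > 0]
        if not candidates:
            raise RuntimeError("not enough free slices")
        disk = min(candidates, key=lambda d: used[d])
        free[disk] -= 1
        used[disk] += 1
        chosen.append(disk)
    return chosen
```

With three disks holding four free slices each, a request for three slices is served entirely from the first disk by `allocate_naive`, so subsequent I/O to those slices is limited by that one disk, whereas `allocate_spread` places one slice on each disk.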
Systems that manage large numbers or amounts of resources often must impose organizational structures onto the collection of resources in order to manage the collection in a rational way. Preferably, the organization is along natural boundaries that consider real, physical characteristics and constraints of the collection and its individual components. The difficulties of managing large and complex collections of resources may be mitigated via the use of high level abstractions to represent, in simplified form, certain aspects of the system, the collections or resources, and the organization imposed thereon.
A large data storage array is an illustrative example. A traditional storage array (herein also referred to as a “disk storage array”, “disk array”, or simply “array”) is a collection of storage entities, such as hard disk drives, solid state drives, tape drives, or other entities used to store information (for simplicity, hereinafter referred to as “disks”), operating together logically as a unified storage device. A storage array may be thought of as a system for managing a large amount of a resource, i.e., a large number of disk sectors.
Management of the resource may include allocation of a portion of the resource in response to allocation requests. In the storage array example, portions of the storage array may be allocated to, i.e., exclusively used by, entities that request such allocation. One issue that may be considered during allocation of a resource is the selection process, namely, how to determine which unallocated portion of the collection of resources is to be allocated to the requesting entity.
Conventionally, all resources of the same type are treated the same because it is assumed that components within the data storage array perform similarly and that data will be stored and accessed evenly across the array. Initially, this assumption may be valid because any performance differences between resources of the same type and any asymmetries in data usage are unknown. However, as the data storage array fills up and the stored data is accessed, some resources may be more heavily utilized than other resources of the same type and/or resources of the same type may begin to perform differently. For example, two identical 7,200 RPM disks may initially be assumed to have identical performance and share data storage and processing loads equally because the client initially stores 10 GB on each disk. However, at some later point in time, the client may either delete or rarely access the data stored on the second disk while constantly updating the files stored on the first disk. As a result, the first disk may operate with slower performance. While the client may have previously been able to observe this inefficiency, the client was unable to correct it because the client had no input or control regarding how slices were allocated or re-allocated. For example, no mechanism (e.g., slice allocation policy) currently exists for allocating a slice according to a particular performance tier or other resource constraint specified by the client in a slice allocation request.
Accordingly, there exists a need for methods, systems, and computer readable media for tier-based slice allocation and data relocation in a data storage array.