1. Field of the Invention
This invention relates to data storage systems. Particularly, this invention relates to data storage systems using performance-based volume allocation using storage pools under varying degrees of control.
2. Description of the Related Art
Performance-based volume allocation is an important aspect of storage provisioning for workloads. Modern storage controllers have several components within, such as arrays, ranks, device adapters, host adapters, and processor complexes, each with a specific set of capacities. When introducing a workload, one of the tasks a storage administrator confronts is to decide where to allocate the necessary storage volumes for that workload to meet its performance requirements while taking into account the existing performance load measures at the various storage components.
Currently, such performance-based volume allocation is performed manually based on the knowledge of expert storage consultants. Accordingly, such performance-based allocation is expensive, error-prone and sometimes yields inferior solutions for the customers, depending upon the level of expertise and how many alternatives a consultant considers. Some known projects, such as Ergastulum and Hippodrome from Hewlett-Packard Labs, look at how to design a cost-effective storage system given the usage requirements, but they do not address the question of where to allocate new volumes based on real-time performance measures for a given storage system. Other products may address volume allocation based mostly on available space bounds. However, they do not consider performance bottlenecks that are likely to happen in various components of a subsystem due to the allocation. This may be because such performance information from the various components was not available in a consolidated location until very recently. Therefore, more often than not, the conventional approach has been to overprovision the storage area network (SAN) and tolerate the resulting underutilization.
Currently, with newer storage management products making improvements at gathering real-time performance information from the various components, a foundation is being laid, upon which performance-based volume allocation may be built.
Different storage controllers have some key differences from one another (e.g., IBM ESS Shark, IBM DS8000 Megamouth, EMC Symmetrix, and HP controllers). Besides their structural and performance variations, a key difference among them, with respect to volume allocation, is in the concept of a “storage pool”. A storage pool is a logical collection of “ranks”. A rank is the basic unit of capacity in a storage system. (Each rank may comprise one or more volumes for allocation.) In some controllers, such as IBM ESS Shark, a single pool comprises a single rank, so the mapping is one-to-one. In other controllers, such as IBM DS8000 Megamouth, a single pool can comprise multiple ranks, so the mapping is one-to-many. Within the one-to-many type, varying degrees of control are exposed in terms of where a new volume can be created during the course of a performance-based volume allocation. These can roughly be classified as “full control”, “no control but can infer”, or “no control and cannot infer”. A “full control” controller is simply defined as allowing the user to fully control which rank the new volume will go to (e.g., IBM ESS Shark). The other two categories, both of which allow no control, require further explanation.
A controller that is inferable (without control) allows the user to only specify which pool the volume will go to, but not which rank within that pool. The controller then uses an internal allocation strategy to pick one or more of the ranks within that pool and creates the volume on those ranks. However, the internal allocation strategy and any internal state information employed are exposed to the user so the user can infer which ranks the controller will select within a given pool.
Finally, a controller that is not inferable (without control) allows the user to only specify which pool the volume will go to, but not which rank within that pool. The controller then uses an internal allocation strategy to pick one or more of the ranks within that pool and creates the volume on those ranks. However, its internal allocation strategy, or some of the internal state information that the controller uses (for example, a cursor), is not exposed to the user, and hence the user cannot even infer which ranks the controller will pick within a given pool. The volume could go to any of the ranks in that pool, and the user needs to account for that accordingly (e.g., IBM DS8000 Megamouth, IBM DS6000).
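The three degrees of control described above can be illustrated with a minimal sketch. The names ControlLevel and ranks_to_plan_for, and the parameters chosen and inferred, are hypothetical and introduced only for illustration; they are not part of any controller interface. The sketch assumes the planner is told the degree of control for a pool and returns the set of ranks on which the new volume may land.

```python
from enum import Enum


class ControlLevel(Enum):
    FULL = "full control"
    INFERABLE = "no control but can infer"
    OPAQUE = "no control and cannot infer"


def ranks_to_plan_for(pool, control, chosen=None, inferred=None):
    """Return the ranks a planner must assume the new volume may land on.

    pool     -- names of all ranks in the pool
    chosen   -- rank selected by the user (FULL only)
    inferred -- ranks predicted from the exposed strategy (INFERABLE only)
    """
    pool_set = set(pool)
    if control is ControlLevel.FULL:
        # The user dictates the rank, so exactly one rank is affected.
        if chosen not in pool_set:
            raise ValueError("chosen rank must belong to the pool")
        return {chosen}
    if control is ControlLevel.INFERABLE:
        # The controller decides, but its choice can be predicted.
        if not inferred or not set(inferred) <= pool_set:
            raise ValueError("inferred ranks must be a non-empty subset of the pool")
        return set(inferred)
    # OPAQUE: the volume could go to any rank in the pool.
    return pool_set
```

For a pool of three ranks, the opaque case returns all three ranks, which reflects the statement above that the user must account for the volume landing on any rank in the pool.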
The two controller types without control and, more generally, the differing degrees of control make volume allocation decisions challenging. The challenges in volume allocation due to these differing degrees of control are fourfold.
First, it can be difficult to deal with variances within a pool. The different ranks in a pool can vary significantly in terms of their current utilization levels (iops/sec, bytes/sec) based on which data they are storing and how active their workloads are. Thus, a simple averaging or aggregation of all ranks in a pool cannot capture this variance.
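As a small illustration of why averaging hides this variance, the following sketch summarizes a pool by its mean utilization and by its per-rank extremes. The function name pool_summary and the utilization figures (integer percentages) are assumptions chosen for illustration, not measurements from any controller.

```python
def pool_summary(utilization_pct):
    """Summarize per-rank utilization (integer percentages) for one pool."""
    if not utilization_pct:
        raise ValueError("pool must contain at least one rank")
    return {
        "mean": sum(utilization_pct) / len(utilization_pct),
        "min": min(utilization_pct),
        "max": max(utilization_pct),
        # Spread between the busiest and the idlest rank in the pool.
        "spread": max(utilization_pct) - min(utilization_pct),
    }


# Hypothetical pool: one nearly idle rank, one nearly saturated rank,
# and one moderately loaded rank.
summary = pool_summary([10, 90, 50])
```

In this hypothetical pool the mean suggests a moderately loaded pool, while the maximum shows that one rank is close to saturation; a volume placed on that rank by a controller without user control would land on a bottleneck that the average does not reveal.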
Second, there can also be space-performance mismatches within a pool. The ranks with space availability may not be the ones that have good performance availability. For example, a rank that is at 10% utilization in terms of iops/sec and bytes/sec may not have any space left, but a rank that is at approximately 90% utilization in terms of iops/sec and bytes/sec may have 50 units of space left. Such imbalances are quite likely since workloads have varying activity rates, and it could be that the workload on the first rank takes up a lot of space but has very infrequent activity. So a naïve approach of separately aggregating all space and performance availability in a pool will not suffice to address this issue.
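The example above can be worked through in a minimal sketch. The Rank type, the two function names, the requested volume size of 40 units, and the 80% utilization ceiling are all assumptions introduced for illustration; only the two ranks (10% utilized with no space, 90% utilized with 50 units of space) come from the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rank:
    name: str
    utilization_pct: int  # current performance utilization
    free_space: int       # free capacity in arbitrary units


def aggregate_looks_feasible(ranks, size, max_util_pct):
    """Naive check: aggregate space and performance separately over the pool."""
    total_free = sum(r.free_space for r in ranks)
    avg_util = sum(r.utilization_pct for r in ranks) / len(ranks)
    return total_free >= size and avg_util <= max_util_pct


def feasible_ranks(ranks, size, max_util_pct):
    """Per-rank check: a rank qualifies only if it has BOTH space and headroom."""
    return [
        r.name
        for r in ranks
        if r.free_space >= size and r.utilization_pct <= max_util_pct
    ]


pool = [
    Rank("rank_a", utilization_pct=10, free_space=0),
    Rank("rank_b", utilization_pct=90, free_space=50),
]

naive = aggregate_looks_feasible(pool, size=40, max_util_pct=80)
per_rank = feasible_ranks(pool, size=40, max_util_pct=80)
```

Under these assumed numbers the naive aggregate check accepts the pool (50 units free, 50% average utilization), while the per-rank check finds no rank that offers both the space and the performance headroom.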
Third, it may be difficult to accommodate a change in the degree of control in the future. The controllers themselves are evolving through multiple software and hardware releases, and it is likely that some of them may move from one degree of control to another over time. For example, for IBM Megamouth there are feature requests open to move it to a “full control” or at least a “no control but can infer” mode. When such a change happens, redoing and re-architecting its VPA code will be a significant cost, both in terms of new software development and in terms of testing and transition management.
Fourth, a heterogeneous mixture of controllers in customer environments can also present difficulties. In general, a customer's storage environment may have many controllers with differing levels of control. In such a scenario, using entirely different algorithms or code designs for each type of control makes it hard to combine them and look at the overall solution space in a unified way. Such unification is a necessary step to enable evaluating the pool/rank combinations across all controllers simultaneously to pick the best combination.
U.S. Patent Application Publication No. 2005/0262326 A1 by Carlson et al., published on Nov. 24, 2005, discloses a method, system, and article of manufacture for borrow processing in storage pools. A plurality of physical volumes are allocated to a first storage pool. A determination is made whether the first storage pool has less than a threshold number of empty physical volumes. If the first storage pool has less than the threshold number of empty physical volumes, then at least one empty physical volume is borrowed to the first storage pool from a second storage pool.
U.S. Patent Application Publication No. 2005/0108292 A1 by Burton et al., published on May 19, 2005, discloses an apparatus for managing incremental storage that includes a storage pool management module that allocates storage volumes to a virtual volume. Also included is an incremental log corresponding to the virtual volume, which maps virtual addresses to storage addresses. The apparatus may also include a replication module that sends replicated data to the virtual volume and a policy management module that determines allocation criteria for the storage pool management module. In one embodiment, the incremental log includes a lookup table that translates read and write requests to physical addresses on storage volumes within the virtual volume. The replicated data may include incremental snapshot data corresponding to one or more primary volumes. The various embodiments of the virtual incremental storage apparatus, method, and system facilitate dynamic adjustment of the storage capacity of the virtual volume to accommodate changing amounts of storage utilization.
U.S. Patent Application Publication No. 2005/0015475 by Fujita et al., published on Jan. 20, 2005, discloses that in an environment in which storages are intensively collected, many unused areas are generated and no storage resources can be efficiently used as a storage pool. The capacity utilization (data capacity) of a storage device (volume) allocated to a computer is obtained and future capacity utilization is estimated from a change in the data capacity. Upper limit securing capacity and lower limit securing capacity showing the upper and lower limits of appropriate allocating capacity calculated from this estimated capacity utilization, and the capacity of the storage device are compared with each other. When the capacity of the storage device (old device) is greater than the upper limit securing capacity, the storage device (new device) of the lower limit securing capacity or more and the upper limit securing capacity or less is allocated from the storage pool, and the old device is collected in the storage pool.
U.S. Pat. No. 6,954,768 by Carlson et al., issued Oct. 11, 2005, discloses a method, system, and article of manufacture for pooling of storage. Volume attributes are assigned to a plurality of physical volumes. Pool attributes are assigned to a plurality of storage pools, wherein the pool attributes include policies for borrowing and returning the plurality of physical volumes to and from the plurality of storage pools. One of the plurality of physical volumes is allocated to one of the plurality of storage pools based on the volume attributes of the one of the plurality of physical volumes and the pool attributes of the one of the plurality of storage pools.
U.S. Pat. No. 6,810,462 by Matsunami et al., issued Oct. 26, 2004, discloses that a storage has NAS and SAN functions and a high degree of freedom to configure a system to reduce the management and operation cost. The storage includes a plurality of interface slots in which a plurality of interface controllers can be installed, a block I/O interface controller which has SAN functions and which can be installed in the slot, a file I/O interface controller which has NAS functions and which can be installed in the slots, a storage capacity pool including a plurality of disk devices accessible from the interface controllers, and a storage capacity pool controller to control the storage capacity pool.
Existing systems and methods do not address the problem of handling different levels of control during volume allocation, particularly in systems and methods employing performance-based volume allocation algorithms. Thus, there is a need in the art for systems and methods to accommodate varying levels of control in a storage system during volume allocation. These and other needs are met by the present invention as detailed hereafter.