Enterprises are increasingly outsourcing the management of their data and applications to managed hosting services that collocate multiple sites, applications, and customer types on the same host machine or cluster and provide different quality-of-service (QoS) levels to them based on various pricing options. Due to the higher achievable utilization of resources and the benefits of centralized management, enterprises are also increasingly consolidating resources and sharing them across applications and users. Sharing and aggregation in storage subsystems is supported by network-attached storage (NAS) servers and storage area networks (SANs) that allow access to storage devices from multiple servers.
One of the central challenges in shared environments is to manage resources such that applications and customers are isolated from each other and their performance can be guaranteed as in a dedicated environment. Numerous mechanisms for service differentiation and performance isolation have been proposed in the literature in the context of collocated web servers. Such mechanisms include QoS-aware extensions for admission control, TCP SYN packets, policing and request classification, accept queue scheduling, and CPU, network, and disk bandwidth scheduling. However, the management of storage resources in a way that takes into consideration workload classes having different QoS requirements has largely remained unaddressed.
The storage system in a shared hosting environment may be a NAS server (supporting NFS/CIFS-based network file access), a block server such as a SAN-based block virtualization engine, or an enterprise storage system such as IBM TotalStorage and EMC Symmetrix. To provide QoS in these storage systems, resources such as CPU and cache at the NAS and block servers, SAN network bandwidth, and disk bandwidth have to be managed. Techniques for allocating CPU, network, and disk bandwidth have been investigated in the literature for web servers and other applications and can potentially be applied to storage systems. However, techniques for allocating cache to provide QoS differentiation have not been adequately investigated and are the focus of this invention.
Caches differ fundamentally from other resources such as CPU and network bandwidth in two respects. First, if CPU (or network bandwidth) is allocated to a workload class, it can be immediately used to improve the performance for that class. In contrast, the allocation of cache space does not yield immediate performance benefits for a workload class; performance benefits accrue in the future only if there are cache hits. Furthermore, unlike CPU allocation, the current cache space allocation for a class can significantly impact the future cache performance of all other classes. With ephemeral resources such as CPU and network bandwidth, adaptation is faster because reallocation takes effect immediately, while with cache allocation any adaptation technique requires a window into the future. Second, the performance benefit of cache space allocation depends on the workload characteristics of the class. More cache space does not necessarily imply better performance (e.g., if the workload has no hits). Due to these fundamental differences, techniques for meeting the QoS requirements of multiple classes that have been developed for resources such as CPU and network bandwidth cannot be directly applied to caches.
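The observation that more cache space does not necessarily imply better performance can be illustrated with a minimal simulation sketch. The sketch assumes an LRU replacement policy and two synthetic block traces; the function name `lru_hit_ratio`, the traces, and the cache sizes are hypothetical choices made for illustration and are not taken from the system described here.

```python
from collections import OrderedDict

def lru_hit_ratio(trace, cache_size):
    """Replay a block trace through an LRU cache and return the hit ratio.

    LRU is an assumed policy, used only to illustrate the point.
    """
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace) if trace else 0.0

# Hypothetical workloads: a one-pass scan (no reuse) and a loop over 50 blocks.
scan_trace = list(range(1000))
loop_trace = [i % 50 for i in range(1000)]

for size in (10, 50, 100):
    print(size, lru_hit_ratio(scan_trace, size), lru_hit_ratio(loop_trace, size))
```

Under these assumptions the scan workload gains nothing from any amount of cache, while the looping workload gains nothing until the cache holds its entire working set and nothing further beyond that point.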
One approach for cache allocation is to statically partition the cache among different workload classes. FIG. 1 is a flow chart showing an example of prior art methods for statically allocating the cache. At step 10, the prior art method examines the QoS requirements of the workload classes offline. It then computes the cache size for each workload class based on its QoS requirements at step 11. At step 12, the method allocates the computed cache space to each workload class. As the workloads in a class change, the method would have to recompute the cache size offline at step 13 and allocate the cache space again at step 12. Such an approach in the prior art has two main drawbacks. First, due to the dynamic nature of the workload, it is difficult to determine the appropriate partition size a priori. Second, static partitioning leads to inefficient use of the cache. For example, if at some point in time one class has low locality and the other has high locality, then static allocation will underutilize the cache.
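The second drawback can be made concrete with a small sketch. It is not the method of FIG. 1; it only compares two fixed partitions of a 100-block cache, assuming a separate LRU cache per class and two synthetic traces. All names (`lru_hits`, `total_hits`, `static_split`, `better_split`) and all sizes are hypothetical.

```python
from collections import OrderedDict

def lru_hits(trace, size):
    """Count hits for one class replayed through its own LRU partition."""
    cache, hits = OrderedDict(), 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # refresh recency on a hit
        elif size > 0:
            cache[block] = True
            if len(cache) > size:
                cache.popitem(last=False)  # evict least recently used
    return hits

def total_hits(traces, partition):
    """Sum hits over all classes for a given per-class partition size."""
    return sum(lru_hits(traces[name], partition[name]) for name in traces)

# Hypothetical classes: a one-pass scan and a loop over an 80-block working set.
traces = {
    "low_locality": list(range(2000)),
    "high_locality": [i % 80 for i in range(2000)],
}

static_split = {"low_locality": 50, "high_locality": 50}  # fixed offline
better_split = {"low_locality": 20, "high_locality": 80}  # fits the working set

print(total_hits(traces, static_split))
print(total_hits(traces, better_split))
```

In this sketch the even split leaves the low-locality class holding space it cannot use while the high-locality class thrashes, so the cache as a whole yields no hits; shifting the same total space toward the high-locality class recovers them.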
Therefore, there remains a need for a storage system and method for efficiently and dynamically allocating cache space among multiple classes of workloads in the system, where each class has a unique set of quality-of-service requirements, to maximize system performance.