Modern applications are shifting from compute-centric to data-centric arrangements as they require continuous access to persistent data rather than computing from memory-resident datasets. In addition, the trend towards application consolidation in data centers leads to running demanding workloads with diverse I/O requirements on modern servers. Furthermore, cost and energy efficiency constraints dictate building larger servers with many cores and I/O resources that host multiple applications. Until recently, the I/O subsystem in modern servers has been limited by the devices themselves. Today, with the availability of NAND-Flash based solid-state disks (SSDs), high IOPS are becoming economically viable for typical data center servers and flash is becoming a necessary layer in the storage hierarchy.
These trends have created a landscape with significant new challenges for the storage I/O path in modern servers. Independent workloads interfere with each other when competing for shared resources, and they experience both performance degradation and large performance variance. Competition for space in the DRAM I/O cache results in more accesses to the underlying tiers of the storage hierarchy. Mixed device access patterns result in device performance well below nominal device capabilities. Shared buffers and structures in the I/O path result in increased contention, even for independent workloads accessing shared resources.
On today's servers, data-centric applications can experience severe performance degradation when run concurrently with other applications. FIG. 1 illustrates this point for a server running multiple virtual machines (VMs), where each VM hosts a workload that performs some level of storage I/O. Production VM refers to the workload instance whose performance is to be improved or maintained at an acceptable level. Noise VM refers to the workload instance that accesses a separate data-set and may interfere with the production VM. Three distinct I/O load profiles are considered, marked with low, mid, high. As shown, the performance degradation for the production VM when run concurrently with another VM can be as high as 40× for a transaction processing workload, TPC-E, when looking at the transaction rate. For another transaction processing workload, TPC-W, the performance degradation can be as high as three orders of magnitude over the nominal case, when looking at the average transaction latency. It is important to note that the system where these workloads are running has adequate resources to support both the production and interfering VMs. However, on an unregulated system, the capacity available to the production VM varies over time, resulting in severe performance degradation.
Improving I/O performance, and in particular the performance of the storage I/O path across different workloads, is a problem tackled in many prior art works. This problem is exacerbated when the OS or hypervisor manages large amounts of I/O resources as a single pool, concurrently accessed by all workloads via a shared I/O path. For example, in Linux systems, control groups (cgroups) is currently the only mechanism in the Linux kernel to limit resource usage per process group. cgroups allow users to specify resource limits per process group, for numerous resources in the Linux kernel, including cache sizes and bandwidth in the I/O path. However, cgroups still treat the I/O path as a single shared entity and lead applications to compete, e.g., for buffers, locks, structures, and global allocation policies.
Other prior art has thus far focused on proportional fair-share allocation of resources, on relatively long timescales. Although these prior art arrangements limit the use, allocation, or scheduling of resources, applications still access each resource in a single, system-wide, shared pool via shared structures and metadata. For instance, there is still a single DRAM I/O cache in all cases, and usually shared devices.
These problems with the prior art are becoming increasingly pertinent as the performance of storage improves with current technology trends. However, most prior art solutions have focused on improving upon the mechanisms used for synchronizing storage resource access. This results in the bottleneck being shifted from particular storage devices, or other physical resources, to the single I/O path from the storage resources to the kernel or user space in a computer system.
Accordingly, there is a need in the art for improving storage I/O path throughput or performance, especially in the presence of multiple workloads.