Virtualized computing environments provide tremendous efficiency and flexibility for system operators by enabling computing resources to be deployed and managed as needed to accommodate specific applications and capacity requirements. As virtualized computing environments mature and achieve broad market acceptance, demand continues for increased performance of virtual machines (VMs) and increased overall system efficiency. A typical virtualized computing environment includes one or more host computers, one or more storage systems, and one or more networking systems configured to couple the host computers to each other, to the storage systems, and to a management server. A given host computer may execute a set of VMs, each typically configured to store and retrieve file system data within a corresponding storage system. Relatively high access latencies associated with the mechanical hard disk drives included in the storage system give rise to a major bottleneck in file system performance, reducing overall system performance.
One approach for improving system performance involves implementing a buffer cache for a file system running in a guest operating system (OS). The buffer cache is stored in machine memory (i.e., physical memory configured within a host computer; also referred to as random access memory or RAM) and is therefore limited in size compared to the overall storage capacity available within the storage systems. While machine memory provides a significant performance advantage over the storage systems, this size limitation has the net effect of reducing system performance because certain units of storage may be evicted from the buffer cache prior to a subsequent access. Once a unit of storage is evicted from the buffer cache, accessing that same unit of storage again typically requires an additional low-performance access to the corresponding storage system.
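The eviction effect described above can be illustrated with a minimal sketch. It assumes a least-recently-used (LRU) replacement policy, which the text does not specify; the class name `BufferCache`, the `backing_store` dictionary standing in for the storage system, and the capacity of two blocks are all hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Hypothetical fixed-size buffer cache; LRU eviction is an assumption."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity            # number of blocks that fit in RAM
        self.backing_store = backing_store  # stand-in for the slow storage system
        self.blocks = OrderedDict()         # block number -> data, oldest first
        self.storage_reads = 0              # counts slow accesses to storage

    def read(self, block_no):
        if block_no in self.blocks:
            # Hit: served from machine memory; mark as most recently used.
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Miss: requires a slow access to the storage system.
        self.storage_reads += 1
        data = self.backing_store[block_no]
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            # Evict the least recently used block to stay within capacity.
            self.blocks.popitem(last=False)
        return data


storage = {n: f"block-{n}" for n in range(4)}
cache = BufferCache(capacity=2, backing_store=storage)
for n in (0, 1, 2, 0):  # block 0 is evicted by block 2 before it is read again
    cache.read(n)
print(cache.storage_reads)
```

In this sketch the second read of block 0 goes back to the backing store because the block was evicted in the meantime, so all four reads incur a slow storage access.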
A RAM-based file system buffer cache is a common way of improving input/output (IO) performance in guest operating systems, but the improvement is limited by the high price and limited density of RAM. As flash-based solid state drives (SSDs) emerge as new storage media offering many more input/output operations per second than hard disk drives at a lower price than RAM, they are being broadly deployed as a second-level read cache in virtualized computing environments, e.g., in the virtualization software known as a hypervisor. As a result, multiple caching layers are formed in the storage IO stack of the virtualized computing environment.
In virtualized computing environments, the different caching layers described above typically and purposefully do not communicate, rendering conventional techniques for achieving cache exclusivity inapplicable. As such, the second-level cache typically does provide improved read performance, but identical data may be cached in both the buffer cache and the second-level cache, reducing overall effective cache size and cost-efficiency. Although conventional caching techniques do improve read performance, adding another layer of caching in a virtualized environment results in a decrease in storage utilization efficiency due to redundant storage of cached data.
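The redundancy described above can be made concrete with a small sketch of two stacked caches that do not communicate. The sketch assumes LRU replacement in both layers and a second-level cache that inserts every block it fetches; neither policy is stated in the text. The names `Storage`, `LRUCache`, `buffer_cache`, and `second_level`, as well as the capacities, are hypothetical.

```python
from collections import OrderedDict


class Storage:
    """Stand-in for the storage system; every read here is a slow access."""

    def __init__(self, num_blocks):
        self.data = {n: f"block-{n}" for n in range(num_blocks)}

    def read(self, block_no):
        return self.data[block_no]


class LRUCache:
    """Hypothetical cache layer that knows nothing about the layers around it."""

    def __init__(self, capacity, lower):
        self.capacity = capacity
        self.lower = lower           # next layer down: another cache or storage
        self.blocks = OrderedDict()  # block number -> data, oldest first

    def read(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Miss: fetch from the layer below and keep a copy here as well.
        data = self.lower.read(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return data


storage = Storage(num_blocks=10)
second_level = LRUCache(capacity=4, lower=storage)       # e.g., SSD cache in the hypervisor
buffer_cache = LRUCache(capacity=2, lower=second_level)  # RAM cache in the guest OS

for n in (0, 1, 2, 3):
    buffer_cache.read(n)

# Blocks held in both layers occupy two slots but count only once.
duplicated = set(buffer_cache.blocks) & set(second_level.blocks)
distinct = set(buffer_cache.blocks) | set(second_level.blocks)
total_slots = buffer_cache.capacity + second_level.capacity
print(sorted(duplicated), len(distinct), total_slots)
```

Under these assumptions, every block resident in the buffer cache is also resident in the second-level cache, so the six available slots hold only four distinct blocks. An exclusive arrangement would require the layers to coordinate, which the paragraph above notes they typically do not.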