Hypervisor-based flash caching is a technology that enables the hypervisor of a server system to leverage flash storage to accelerate virtual machine (VM) I/O operations. In particular, the hypervisor can store, in a portion of a flash storage device referred to as a “flash cache,” data that one or more VMs read from and/or write to virtual disks stored on, e.g., a traditional hard disk-based storage array. When the hypervisor detects a VM I/O request, the hypervisor can service the I/O request, if possible, from the flash cache rather than from the storage array. Since the I/O latency for flash storage access is typically several orders of magnitude less than the I/O latency for hard disk access, this caching mechanism can significantly improve VM I/O performance.
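As a rough illustration of the read path described above, the following Python sketch models a hypervisor that checks a flash cache before falling back to the storage array. It is a toy model under stated assumptions, not an actual hypervisor implementation: the class name `FlashCachedDisk`, its methods, and the use of in-memory dictionaries to stand in for the flash device and the disk array are all hypothetical, and the write-through behavior is one possible policy that the text above does not specify.

```python
class FlashCachedDisk:
    """Toy model: a flash cache placed in front of a slower backing array.

    Both tiers are plain dicts keyed by block number (an assumption made
    purely for illustration).
    """

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the hard disk-based array
        self.flash_cache = {}               # stands in for the flash cache
        self.hits = 0
        self.misses = 0

    def read(self, block):
        # Service the request from the flash cache when possible.
        if block in self.flash_cache:
            self.hits += 1
            return self.flash_cache[block]
        # Otherwise go to the backing array and cache the result for next time.
        self.misses += 1
        data = self.backing_store[block]
        self.flash_cache[block] = data
        return data

    def write(self, block, data):
        # Assumed write-through policy: update the array and the cache together.
        self.backing_store[block] = data
        self.flash_cache[block] = data


disk = FlashCachedDisk({0: b"boot", 1: b"data"})
disk.read(0)  # miss: fetched from the backing array, then cached
disk.read(0)  # hit: served from the flash cache
print(disk.hits, disk.misses)  # prints "1 1"
```

In this sketch the second read of block 0 never touches the backing store, which is the source of the latency benefit described above.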
One of the challenges of implementing hypervisor-based flash caching in a server system that hosts multiple VMs involves managing the amount of flash cache space that is allocated to each VM (referred to as the VM's “cache allocation”). The size of this cache allocation represents the maximum amount of data that the flash storage device can hold for the VM; once this cap is reached, the hypervisor must begin evicting cached data from the VM's cache allocation in order to make room for additional data. A cache allocation size that is too small will decrease the utility of the flash cache for the VM because the hypervisor will evict a significant percentage of the VM's cached data before the VM can re-access it. On the other hand, a cache allocation size that is too large will unnecessarily consume space on the flash storage device, space that could be better used by allocating it to one or more other VMs.
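The sizing trade-off can be seen in a small simulation. The sketch below is illustrative only: the class `VmCacheAllocation`, the helper `hit_ratio_for`, the least-recently-used (LRU) eviction policy, and the synthetic access trace are all assumptions introduced here, since the text does not name a particular eviction policy or workload.

```python
from collections import OrderedDict


class VmCacheAllocation:
    """Toy per-VM cache allocation with a fixed cap, using LRU eviction (assumed)."""

    def __init__(self, capacity_blocks):
        if capacity_blocks < 1:
            raise ValueError("capacity_blocks must be at least 1")
        self.capacity_blocks = capacity_blocks
        self.blocks = OrderedDict()  # oldest entry first, most recent last
        self.hits = 0
        self.accesses = 0

    def access(self, block):
        self.accesses += 1
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)  # mark as most recently used
            return
        if len(self.blocks) >= self.capacity_blocks:
            self.blocks.popitem(last=False)  # cap reached: evict the LRU block
        self.blocks[block] = True

    def hit_ratio(self):
        return self.hits / self.accesses if self.accesses else 0.0


def hit_ratio_for(capacity_blocks, trace):
    """Replay a block-access trace against an allocation of the given size."""
    alloc = VmCacheAllocation(capacity_blocks)
    for block in trace:
        alloc.access(block)
    return alloc.hit_ratio()


# Hypothetical workload: a VM that loops over a 100-block working set 10 times.
trace = list(range(100)) * 10
for capacity in (50, 100, 200):
    print(capacity, hit_ratio_for(capacity, trace))
# prints:
# 50 0.0
# 100 0.9
# 200 0.9
```

With this particular trace, an allocation of 50 blocks evicts every block before the VM returns to it, so the cache provides no benefit. An allocation of 100 blocks captures the whole working set, and doubling it to 200 blocks yields no further improvement, so the extra space would be better given to another VM.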