Server Flash Cache (SFC) is a technology that allows host systems to leverage flash storage to accelerate virtual machine (VM) I/O operations. Generally speaking, an SFC-enabled host system includes a host-side flash storage device (e.g., a solid state disk (SSD), a PCIe flash card, etc.) and a hypervisor-resident caching module. The caching module intercepts I/O requests from VMs running on the host system and caches, in a portion of the host-side flash storage device referred to as a “flash cache,” data associated with the I/O requests that the host system reads from, or writes to, a backend storage array (e.g., a hard disk-based array). In addition, upon intercepting a read request, the caching module determines whether the data associated with the read request is already available in the flash cache. If so, the caching module services the read request from the flash cache rather than from the backend storage array. Since the I/O latency for flash storage access is typically several orders of magnitude lower than the I/O latency for hard disk access, this caching mechanism can significantly improve VM I/O performance.
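The read and write path described above can be illustrated with a minimal sketch. It is not taken from any actual SFC implementation: the class name `CachingModule`, the dictionary standing in for the backend storage array, the write-through behavior, and the least-recently-used (LRU) eviction policy are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class CachingModule:
    """Hypothetical stand-in for a hypervisor-resident caching module."""

    def __init__(self, backend, capacity):
        self.backend = backend            # dict standing in for the backend storage array
        self.capacity = capacity          # maximum number of blocks held in the flash cache
        self.flash_cache = OrderedDict()  # block address -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def _insert(self, block, data):
        # Store the block as most recently used, then evict if over capacity.
        self.flash_cache[block] = data
        self.flash_cache.move_to_end(block)
        if len(self.flash_cache) > self.capacity:
            self.flash_cache.popitem(last=False)  # assumed policy: evict the LRU block

    def read(self, block):
        # Intercepted read: service it from the flash cache when possible.
        if block in self.flash_cache:
            self.hits += 1
            self.flash_cache.move_to_end(block)
            return self.flash_cache[block]
        # Cache miss: fetch from the backend and cache the result.
        self.misses += 1
        data = self.backend[block]
        self._insert(block, data)
        return data

    def write(self, block, data):
        # Intercepted write: assumed write-through, so the backend is always updated.
        self.backend[block] = data
        self._insert(block, data)


if __name__ == "__main__":
    cache = CachingModule({0: b"a", 1: b"b", 2: b"c"}, capacity=2)
    for block in (0, 0, 1, 2):
        cache.read(block)
    print(cache.hits, cache.misses)
```

In this sketch every caching decision is driven solely by the sequence of observed requests; the module has no other information about how a block will be used.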
In certain instances, an application running within a VM may have access to contextual information regarding the I/O requests it issues that can assist the caching module in managing the flash cache. For example, the application may know that it will issue a read request for a particular data block several times over a short timespan, which suggests that the caching module should keep that data block in the flash cache to service the multiple requests. As another example, the application may know that it will issue a read request for a particular data block only once, which suggests that the caching module should deprioritize that data block or avoid caching it altogether. Unfortunately, with current SFC implementations, there is no way to communicate such contextual information (referred to as “I/O hints” or “hints”) from the VM-level application to the hypervisor-level caching module. Thus, the caching module can make caching decisions based only on the observed I/O requests themselves, which may result in suboptimal use and management of the flash cache.