Server Flash Cache (SFC) is a technology that allows server systems to use flash storage as a cache to accelerate virtual machine (VM) I/O operations. Several SFC implementations support a feature known as write-behind caching. When a server system enables SFC write-behind caching, the server system intercepts VM write requests directed to virtual disks stored in a backend storage device (e.g., a hard disk-based array), caches the data associated with the write requests in a flash storage-based cache (i.e., “flash cache”), and immediately returns acknowledgements to the originating VMs indicating successful write completion. Upon receiving the acknowledgements, the VMs continue their processing. At a later point in time, the server system flushes the data from the flash cache to the backend storage device, thereby completing the actual write process. Since the VMs can proceed with their processing as soon as the server system caches the data in flash storage (rather than waiting for the server system to write the data to slower hard disk-based storage), this feature can significantly improve VM write performance.
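The write-behind sequence described above can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not any particular SFC implementation: the class name `WriteBehindCache`, its methods, and the use of in-memory dictionaries to stand in for the flash cache and the backend storage device are all hypothetical.

```python
class WriteBehindCache:
    """Toy model: dicts stand in for the flash cache and the backend array."""

    def __init__(self):
        self.flash = {}    # (vdisk, block) -> data cached in flash
        self.backend = {}  # (vdisk, block) -> data on the backend storage device

    def write(self, vdisk, block, data):
        # Intercept the VM write, cache the data in flash, acknowledge at once.
        self.flash[(vdisk, block)] = data
        return "ACK"

    def flush(self):
        # At a later point: propagate cached writes to the backend,
        # completing the actual write process.
        for location, data in self.flash.items():
            self.backend[location] = data
        self.flash.clear()


cache = WriteBehindCache()
ack = cache.write("vdisk0", 7, b"hello")

# The VM already holds its acknowledgement, yet the backend has no data.
acked_before_flush = ("vdisk0", 7) not in cache.backend

cache.flush()
```

In this sketch the acknowledgement is returned before `flush()` runs, which is the window during which the VM believes the write is complete while the data resides only in the flash cache.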
To carry out write-behind caching in an efficient manner, the server system generally maintains, in volatile memory (e.g., RAM), cache metadata that keeps track of which pages in the flash cache are “dirty” (i.e., include unflushed write updates) and how those dirty pages map to target locations on disk. When the server system is ready to flush the flash cache to the backend storage device, the server system accesses the in-memory cache metadata to determine what data needs to be flushed and where the data should be written.
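One possible shape for such cache metadata is sketched below. The names `CacheMetadata`, `FlashCache`, `dirty_map`, `cache_write`, and `flush` are hypothetical and chosen only for illustration; the sketch assumes one write per flash page and uses dictionaries in place of real devices.

```python
from dataclasses import dataclass, field


@dataclass
class CacheMetadata:
    # Held in volatile memory (RAM) in the scheme described above.
    # Maps a dirty flash page index to its target (vdisk, disk offset).
    dirty_map: dict = field(default_factory=dict)


@dataclass
class FlashCache:
    pages: dict = field(default_factory=dict)  # flash page index -> data
    meta: CacheMetadata = field(default_factory=CacheMetadata)
    next_page: int = 0

    def cache_write(self, vdisk, offset, data):
        # Store the data in a fresh flash page and record where it belongs.
        page = self.next_page
        self.next_page += 1
        self.pages[page] = data
        self.meta.dirty_map[page] = (vdisk, offset)
        return page

    def flush(self, backend):
        # Consult the metadata to learn what to flush and where to write it.
        # Pages are flushed in allocation order, so a later write to the
        # same target overwrites an earlier one.
        for page, target in sorted(self.meta.dirty_map.items()):
            backend[target] = self.pages[page]
        self.meta.dirty_map.clear()


backend = {}
fc = FlashCache()
fc.cache_write("vdisk0", 4096, b"A")
fc.cache_write("vdisk1", 0, b"B")
dirty_before = dict(fc.meta.dirty_map)
fc.flush(backend)
```

The flush routine never inspects the flash pages on their own; it relies entirely on `dirty_map` to decide which pages to write and to which disk locations.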
One issue with maintaining cache metadata in volatile memory as noted above is that the cache metadata does not persist across system crashes and other events that cause a system shutdown or power cycle. The unexpected loss of this cache metadata due to such an event can leave the server system and backend storage device in an inconsistent state. For example, consider a scenario where the server system crashes after it has cached and acknowledged a VM write request, but before it has flushed the data associated with the write request from the flash cache to the backend storage device. Upon recovering from the crash, the server system no longer has access to the cache metadata, and thus cannot determine which pages in the flash cache are dirty or where those pages belong on disk. Consequently, it cannot flush the data from the flash cache. As a result, the data is effectively “lost,” since the server system is unable to propagate it to persistent storage. At the same time, the VM that originated the write request assumes (due to the acknowledgement it received prior to the crash) that the data is stored in the virtual disk resident on the backend storage device, when in fact it is not. This inconsistency can lead to unpredictable errors and other difficult-to-resolve issues.
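The crash scenario can be simulated in a few lines. This is an illustrative sketch only: the variables `flash_pages`, `dirty_map`, and `backend`, and the functions `vm_write` and `flush`, are hypothetical, and the crash is modeled simply by clearing the structure that would have lived in RAM.

```python
flash_pages = {}  # persistent: contents survive the crash
dirty_map = {}    # volatile: flash page -> (vdisk, offset), lost on crash
backend = {}      # persistent backend storage device


def vm_write(page, vdisk, offset, data):
    # Cache the write in flash, track it in RAM, and acknowledge the VM.
    flash_pages[page] = data
    dirty_map[page] = (vdisk, offset)
    return "ACK"


def flush():
    # Flush every page the in-memory metadata marks as dirty.
    flushed = 0
    for page, target in list(dirty_map.items()):
        backend[target] = flash_pages[page]
        del dirty_map[page]
        flushed += 1
    return flushed


ack = vm_write(0, "vdisk0", 8192, b"payload")

# Simulated crash before any flush: RAM contents vanish,
# while the flash cache and backend keep whatever they held.
dirty_map.clear()

# After recovery, the flush has nothing to work from.
flushed_after_recovery = flush()
data_still_in_flash = flash_pages[0] == b"payload"
written_to_backend = ("vdisk0", 8192) in backend
```

Under these assumptions the data is still physically present in the flash cache after recovery, but the flush writes nothing because no record remains of which page is dirty or where it should go, even though the VM has already received its acknowledgement.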