Host computers allocate portions of flash memory to virtual machines to serve as an input/output cache between the virtual machine and an underlying storage device. For example, an administrator may allocate to a virtual machine a portion of the host computer's flash memory equal to a percentage of the size of the virtual machine's virtual disk drive. As a result, data for the virtual machine may be fetched from the host computer's flash memory rather than from the underlying storage device. The benefit of caching data blocks in flash memory, however, depends upon the workload and, in particular, upon the data reuse pattern of the workload. For example, allocating a large portion of flash to a virtual machine with a streaming workload (i.e., a workload with no data reuse) will result in little to no benefit. Additionally, one or more other virtual machine workloads running on the host computer may exhibit a greater amount of data reuse and would therefore make better use of that portion of flash memory.
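The dependence of cache benefit on data reuse can be illustrated with a minimal sketch. It assumes a least-recently-used (LRU) eviction policy and two synthetic block traces; the policy, the function name `lru_hit_ratio`, and all sizes are hypothetical choices made for illustration and are not taken from any particular host computer or hypervisor.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, cache_blocks):
    """Replay a block-address trace through an LRU cache; return the hit ratio.

    The LRU policy is an assumption made for this illustration only.
    """
    cache = OrderedDict()  # keys are block addresses, ordered oldest to newest
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark the block as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace) if trace else 0.0


# Hypothetical traces: a streaming workload touches each block exactly once,
# while a reusing workload cycles over a 500-block working set.
streaming = list(range(10_000))
reusing = [i % 500 for i in range(10_000)]

streaming_ratio = lru_hit_ratio(streaming, 1_000)
reusing_ratio = lru_hit_ratio(reusing, 1_000)
streaming_ratio_large = lru_hit_ratio(streaming, 10_000)
```

Under these assumptions the streaming trace never hits the cache regardless of how many blocks are allocated to it, whereas the reusing trace misses only on its first pass over the working set. The same flash would therefore be better spent on the workload with reuse.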
Conventional algorithms used to determine data reuse in workloads consume large amounts of processing and memory resources. As a result, these algorithms are typically used for offline workload analysis. Because the data reuse pattern of a workload can vary over time, it is challenging for administrators to specify flash allocations based upon the data reuse in a given workload. This problem is exacerbated when the administrator is responsible for allocating flash in multiple host computers, each running thousands of virtual machines.