Mass storage devices within computers are essential for holding each end user's important information as well as application data and low-level operating system files. Although these devices have improved significantly over the years, they remain among the slowest components in the system when accessing data; the processor and main system memory transfer data much faster. To increase the performance of a mass storage device relative to the other computer system components, it is quite common to associate a cache with the device that stores information recently accessed through read and write commands. A cache often improves the performance of the device, but depending on the location of the data on the mass storage device and the data access patterns, the cache is frequently utilized inefficiently.
Caches are most useful for workloads that make a large number of accesses to a small subset of the total data. Caches are also more efficient if they are able to accumulate historical information about which data is most likely to be reused. When using a disk cache, whether composed of volatile or non-volatile memory, it is important to consider the impact of workloads that can flush useful contents from the cache. Problematic workloads include those that access a large amount of data only once, displacing the useful contents of the cache while deriving no benefit from caching themselves. One such potentially problematic workload is a streaming workload (e.g., video or audio playback).
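The displacement effect described above can be illustrated with a small simulation. The following is a minimal sketch that assumes a least-recently-used (LRU) replacement policy, which the text does not specify; the names `LRUCache` and `run`, the cache capacity, and the block numbers are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy block cache with LRU eviction (an assumed policy, not from the text)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # keys are block numbers, oldest first

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and evict if full."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            return True
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


def run(cache, accesses):
    """Replay a sequence of block accesses and return the number of hits."""
    return sum(cache.access(block) for block in accesses)


cache = LRUCache(capacity=8)
hot_set = list(range(8))                 # small working set that is reused often
one_time_scan = list(range(100, 200))    # large amount of data read exactly once

run(cache, hot_set)                      # warm the cache (every access misses)
hits_before = run(cache, hot_set)        # the whole working set now hits
scan_hits = run(cache, one_time_scan)    # the scan gains nothing from the cache
hits_after = run(cache, hot_set)         # the working set has been displaced

print(hits_before, scan_hits, hits_after)
```

Under these assumptions, the working set hits on every access before the scan, the scan itself never hits, and the working set misses on every access afterward because the scan has displaced it.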
In a streaming workload, a dataset that is larger than the cache is accessed sequentially. In this case, even if the access pattern is repeated, the cache is not helpful because the first part of the stream is evicted by the last part of the stream. Thus, even though the data is inserted into the cache, it is no longer present by the time it is accessed again. Modern operating systems such as Microsoft® Windows® may apply certain disk caching policies, based on file type, to improve performance. Under such a policy, a large movie file may not be cached while it is streamed because the content is played only once.
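The repeated-stream case can be sketched in the same way. The example below again assumes LRU eviction, which the text does not name, and uses a hypothetical helper `count_hits` and a hypothetical cache size of 64 blocks. It compares a stream only one block larger than the cache with a dataset that fits exactly, each read sequentially three times.

```python
from collections import OrderedDict


def count_hits(accesses, capacity):
    """Count cache hits for a sequence of block accesses under assumed LRU eviction."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # refresh recency on a hit
        else:
            cache[block] = None            # insert on a miss
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits


CACHE_BLOCKS = 64                               # hypothetical cache size
stream = list(range(CACHE_BLOCKS + 1))          # one block larger than the cache
fitting = list(range(CACHE_BLOCKS))             # fits in the cache exactly

repeated_stream_hits = count_hits(stream * 3, CACHE_BLOCKS)
repeated_fitting_hits = count_hits(fitting * 3, CACHE_BLOCKS)

print(repeated_stream_hits, repeated_fitting_hits)
```

In this sketch the oversized stream never hits, because each block is evicted shortly before it is requested again, while the dataset that fits hits on every access after the first pass. Other replacement policies may behave differently.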