A typical data storage system includes a cache device (herein simply referred to as a cache) that stores data so that future requests for that data can be served faster. The data stored within a cache may be values that have been computed earlier or duplicates of original values stored elsewhere. If the requested data is contained in the cache (herein referred to as a cache hit), the request can be served by simply reading the cache, which is comparatively fast. On the other hand, if the requested data is not contained in the cache (herein referred to as a cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slow. Hence, the greater the number of requests that can be served from the cache, the better the overall system performance.
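The hit/miss behavior described above can be illustrated with a minimal sketch. The names (`BACKING_STORE`, `read`) are illustrative assumptions, not part of the described system; the backing store stands in for the slower original storage location.

```python
# Hypothetical slow backing store; in practice this would be a disk
# or a recomputation, both far slower than a cache lookup.
BACKING_STORE = {"key1": "value1", "key2": "value2"}

cache = {}

def read(key):
    """Serve a request from the cache on a hit; on a miss, fetch from
    the original storage location and populate the cache."""
    if key in cache:                 # cache hit: fast path
        return cache[key], "hit"
    value = BACKING_STORE[key]       # cache miss: slow path
    cache[key] = value               # store so future requests are faster
    return value, "miss"
```

The first request for a key misses and pays the slow-path cost; repeat requests hit and are served from the cache.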
In some systems, items are accessed and replaced within a cache at the same granularity. In others, hardware characteristics may require that these operations be performed at different granularities. In particular, certain types of solid-state drives (SSDs) and other forms of persistent memory may require that data be “erased” at a granularity larger than that at which reads and writes are performed. SSDs also have limited endurance, i.e., a given region of an SSD can be erased only a limited number of times before performance degradation or write errors ensue. It is therefore beneficial to limit the number of erasures to an SSD.
As used herein, a cache comprises segments (e.g., 16 kilobytes (KB) in size) that are grouped at a larger granularity referred to as a cache unit (e.g., 1 megabyte (MB) in size). Segments may be fixed or variable-sized. As used herein, a “segment” is the data unit of each cache access, and a “cache unit” is a data unit of cached data that is evicted by the cache manager at one time. Once a cache unit is evicted, it can be erased and reused. When a cache client inserts a segment (e.g., writes data to the system), a cache manager packs it into a partially full cache unit. When the cache unit is full, the cache manager writes it to the system (e.g., a storage device) and updates an index to indicate where the data is stored in the storage device. When a cache client requests a segment using a key, such as a <filehandle and offset> or a segment fingerprint such as a SHA1 hash, and the requested segment already exists in the cache, the cache manager provides the cached data to the client. If the requested data does not exist in the cache, the cache manager uses the corresponding index to determine the segment location in the storage device. The cache manager then fetches the segment from the storage device and provides it to the client. In either case, the cache manager may then update a timestamp to indicate when the segment/cache unit was last accessed.
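The insert/flush/lookup flow described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class and attribute names are hypothetical, the storage device is modeled as a dictionary keyed by cache-unit ID, and keys may be any hashable value (e.g., a (filehandle, offset) tuple or a SHA1 fingerprint string).

```python
SEGMENT_SIZE = 16 * 1024                        # 16 KB segments (example size)
UNIT_SIZE = 1024 * 1024                         # 1 MB cache units (example size)
SEGMENTS_PER_UNIT = UNIT_SIZE // SEGMENT_SIZE   # 64 segments per unit

class CacheManager:
    """Illustrative cache manager: packs segments into cache units,
    flushes full units to storage, and maintains a key -> location index."""

    def __init__(self, storage):
        self.storage = storage      # storage device, modeled as a dict
        self.index = {}             # key -> (unit_id, slot within unit)
        self.open_unit = []         # segments in the partially full unit
        self.open_keys = []         # keys of those segments, in slot order
        self.next_unit_id = 0

    def insert(self, key, segment):
        # Pack the new segment into the partially full cache unit.
        self.open_unit.append(segment)
        self.open_keys.append(key)
        if len(self.open_unit) == SEGMENTS_PER_UNIT:
            self._flush()

    def _flush(self):
        # The cache unit is full: write it to the storage device and
        # update the index to record where each segment now lives.
        unit_id = self.next_unit_id
        self.next_unit_id += 1
        self.storage[unit_id] = list(self.open_unit)
        for slot, key in enumerate(self.open_keys):
            self.index[key] = (unit_id, slot)
        self.open_unit, self.open_keys = [], []

    def lookup(self, key):
        # Serve from the in-memory open unit if present; otherwise use
        # the index to locate and fetch the segment from storage.
        if key in self.open_keys:
            return self.open_unit[self.open_keys.index(key)]
        unit_id, slot = self.index[key]
        return self.storage[unit_id][slot]
```

A real implementation would additionally track the per-unit access timestamps mentioned above; they are omitted here to keep the packing and indexing flow visible.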
In order for a cache to be effective at reducing system latency, it must contain the hottest (i.e., most relevant) data. In this regard, when a cache is full, colder (i.e., less relevant) data stored in the cache must be evicted to make room for new data. Conventionally, recency information is used to support a least-recently-used (LRU) eviction policy (or related policies). Under such a policy, the least-recently-used cache unit is evicted. Such a simplistic approach, however, is problematic for coarse-granularity cache eviction. For example, if a cache unit contains one hot segment and the remaining segments are cold, the conventional approach never evicts that cache unit even though it contains mostly irrelevant data.
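The shortcoming of per-unit LRU described above can be sketched in a few lines. The unit names and the logical clock are illustrative assumptions; the point is that a unit's recency is refreshed whenever any one of its segments is accessed, so a single hot segment shields an otherwise cold unit from eviction.

```python
import itertools

clock = itertools.count()   # logical clock supplying recency timestamps

# unit_id -> time of most recent access to ANY segment in that unit
last_access = {"unit_a": next(clock), "unit_b": next(clock)}

def access(unit_id):
    # Accessing any segment refreshes the whole cache unit's timestamp.
    last_access[unit_id] = next(clock)

def evict_lru():
    # Conventional policy: evict the least-recently-used cache unit.
    victim = min(last_access, key=last_access.get)
    del last_access[victim]
    return victim

# Suppose unit_a holds one hot segment among many cold ones. Repeatedly
# touching just that one segment keeps the entire unit "recent", so LRU
# evicts unit_b instead, even though most of unit_a's data is irrelevant.
for _ in range(10):
    access("unit_a")
```

Calling `evict_lru()` at this point removes `unit_b`, while the mostly cold `unit_a` survives.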