Caching is a common technique in computer systems to improve performance by enabling retrieval of frequently accessed data from a higher-speed cache instead of having to retrieve it from slower memory and storage devices. Caching occurs not only at the level of the CPU itself, but also in larger systems, up to and including caching in enterprise-sized storage systems or even potentially globally distributed “cloud storage” systems. Access to cached information is faster—usually much faster—than access to the same information stored in the main memory of the computer, to say nothing of access to information stored in non-solid-state storage devices such as a hard disk.
On a larger scale, dedicated cache management systems may be used to allocate cache space among many different client systems communicating over a network with one or more servers, all sharing access to a peripheral bank of solid-state mass-storage devices. This arrangement may also be found in remote “cloud” computing environments.
Data is typically transferred between memory (or another storage device or system) and cache as cache “lines”, “blocks”, “pages”, etc., whose size may vary from architecture to architecture. Just for the sake of succinctness, all the different types of information that are cached in a given system are referred to collectively here as “data”, even if the “data” comprises instructions, addresses, etc. Transferring a block of data at a time may mean that some of the cached data will not be accessed often enough to benefit from caching, but this is typically more than made up for by the relative efficiency of transferring blocks as opposed to data at many individual memory locations; moreover, because data at adjacent or nearby addresses is very often needed (“spatial locality”), the inefficiency is not as great as it would be if addresses were accessed at random. A common structure for each entry in the cache has at least three elements: a “tag” that indicates where the data came from in memory (generally an address); the data itself; and one or more flag bits, which may indicate, for example, whether the cache entry is currently valid or has been modified.
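The three-element entry structure described above can be sketched as follows. This is only an illustrative sketch: the names `CacheEntry`, `block_tag` and `BLOCK_SIZE`, the 64-byte block size, and the choice of a block's starting address as its tag are assumptions made for the example, not features of any particular architecture.

```python
from dataclasses import dataclass

BLOCK_SIZE = 64  # assumed block size in bytes; real sizes vary by architecture


@dataclass
class CacheEntry:
    tag: int             # where the block came from (here: its starting address)
    data: bytes          # the cached block itself
    valid: bool = True   # flag bit: the entry may be used to satisfy reads
    dirty: bool = False  # flag bit: the entry has been modified since it was cached


def block_tag(address: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the starting address of the block that contains `address`."""
    return address - (address % block_size)


# Addresses 192..255 all fall in the same block, so they share one tag.
entry = CacheEntry(tag=block_tag(200), data=bytes(BLOCK_SIZE))
```

Because every address within a block maps to the same tag, a single entry serves all accesses to nearby addresses, which is where spatial locality pays off.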
Regardless of the number, type or structure of the cache(s), the standard operation is essentially the same: When a system hardware or software component needs to read from a location in storage (main or other memory, a peripheral storage bank, etc.), it first checks whether a copy of that data is in a cache line that includes an entry tagged with the corresponding location identifier, such as a memory address. If it is (a cache hit), then there is no need to expend relatively large numbers of processing cycles to fetch the information from storage; rather, the processor may read the identical data faster, typically much faster, from the cache. If, however, the data at the requested read location is not currently cached (a cache miss), or the corresponding cached entry is marked as invalid, then the data must be fetched from storage, whereupon it may also be cached as a new entry for subsequent retrieval from the cache.
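The read path just described (check the tag, serve a hit from the cache, otherwise fetch from storage and cache the result) might look roughly like this. The class name `SimpleCache`, the 4-byte block size, and the use of a `bytes` object as the backing store are hypothetical choices made only to keep the sketch small; no eviction policy is modeled.

```python
class SimpleCache:
    """Minimal read-only cache in front of a bytes-like backing store."""

    def __init__(self, storage: bytes, block_size: int = 4):
        self.storage = storage
        self.block_size = block_size
        self.entries = {}  # tag -> {"valid": bool, "data": bytes}
        self.hits = 0
        self.misses = 0

    def _tag(self, address: int) -> int:
        # The tag is the starting address of the block containing `address`.
        return address - (address % self.block_size)

    def read(self, address: int) -> int:
        tag = self._tag(address)
        entry = self.entries.get(tag)
        if entry is not None and entry["valid"]:
            self.hits += 1  # cache hit: no access to storage is needed
        else:
            self.misses += 1  # cache miss, or the entry is marked invalid
            block = self.storage[tag:tag + self.block_size]
            entry = {"valid": True, "data": block}
            self.entries[tag] = entry  # cache the block for later reads
        return entry["data"][address - tag]

    def invalidate(self, address: int) -> None:
        # Clear the valid flag of the entry covering `address`, if any.
        entry = self.entries.get(self._tag(address))
        if entry is not None:
            entry["valid"] = False
```

With a 16-byte store, reading address 5 misses and loads the block covering addresses 4 to 7; a subsequent read of address 6 is then a hit, and a read after `invalidate(5)` is treated as a miss again.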
There are two traditional methods for tagging blocks in a cache. One is to name them logically, such as using a Logical Block Address (LBA). One drawback of this method is that when a remote host asks for the block at, say, LBA 18, it is difficult to determine if the block for LBA 18 that the remote host has is current or has been overwritten with new content. This problem of ensuring consistency is especially hard in the face of failures such as a host going out of communication for a while.
The second approach is to name blocks by their storage location. Traditional systems that update data in place have the same consistency issue as LBA-tagged arrangements. Log-structured file systems fare better under this second approach because new content is written to a new location, such that if a block stored at address X is needed and the remote host has that block, the correct data will be referenced. If the block has been moved, however, its storage location will have changed, and although the remote cache may hold the correct data, the address will be wrong. The host will therefore reply that it does not have the data when it actually does.
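The difference between the two tagging methods can be made concrete with a toy model. Everything in the sketch below is an assumption made for illustration: the dictionaries standing in for storage and for the remote host's cache, the location numbers 100, 101 and 205, and the content strings. It shows a stale hit under LBA tagging, a correct miss under location tagging with log-structured writes, and a false miss after a block is moved.

```python
# Case 1: the remote cache is tagged by LBA and storage is updated in place.
storage = {100: b"old"}                       # storage location -> content
lba_map = {18: 100}                           # LBA -> storage location
remote_by_lba = {18: storage[lba_map[18]]}    # remote host caches LBA 18

storage[100] = b"new"                         # in-place overwrite of LBA 18
# The remote host still reports a hit for LBA 18, but its copy is stale.
stale_hit = 18 in remote_by_lba and remote_by_lba[18] != storage[lba_map[18]]

# Case 2: the remote cache is tagged by storage location, log-structured writes.
log = {100: b"v1"}                            # storage location -> content
log_map = {18: 100}                           # LBA -> current location
remote_by_loc = {100: log[100]}               # remote host caches location 100

log[101] = b"v2"                              # new content goes to a new location
log_map[18] = 101
# A lookup by the current location misses, so stale data is never returned.
correct_miss = log_map[18] not in remote_by_loc
remote_by_loc[101] = log[101]                 # remote host now caches the new block

# Case 3: the same block is moved to another location without changing content.
log[205] = log.pop(101)
log_map[18] = 205
# The remote host holds the right bytes, but under the old address.
false_miss = (log_map[18] not in remote_by_loc
              and log[205] in remote_by_loc.values())
```

In this model all three flags evaluate to true: the LBA-tagged cache returns outdated content, the location-tagged cache safely misses after an overwrite, and the same location-tagged cache wrongly misses once the block has been relocated.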
Several issues commonly arise when considering the design of a caching system. One issue is locality: Data in a local cache can be accessed more quickly than data stored in a remote system. Each host therefore typically has a local cache so that it has to do a remote fetch as infrequently as possible.
Another issue is related to granularity. If data is cached as small units, such as individual blocks, the hit rate may be higher, but this will come at the cost of so much administrative overhead that the efficiency of caching is all but lost.
Yet another issue is that caching arrangements that use a storage medium such as flash are efficient when it comes to small read operations, but function best with large writes. A large write, however, such as of a cache line, may cause an overwriting of several smaller data units in the line that were still being actively used. Some data units may thus end up being evicted from the cache even though it would be more efficient to let them remain.
What is needed is thus a caching arrangement and method of operation that uses caching technology efficiently, without increasing overhead beyond the point of diminishing returns, and without too many unnecessary evictions.