A distributed system generally includes many loosely coupled computers, each of which typically includes computing resources (e.g., one or more processors) and/or storage resources (e.g., memory, flash memory, and/or disks). A distributed storage system overlays a storage abstraction (e.g., a key/value store or a file system) on the storage resources of a distributed system. In the distributed storage system, a server process running on one computer can export that computer's storage resources to client processes running on other computers. Remote procedure calls (RPCs) may transfer data from server processes to client processes. Alternatively, Remote Direct Memory Access (RDMA) primitives may be used to transfer data from server hardware to client processes.
Typical logic implementing a distributed cache system can be divided between client jobs and server jobs. Server jobs, placed on machines across a cluster, respond to RPCs from clients that instruct the server jobs to store or retrieve cache data on the machines on which the jobs reside. The server jobs may require access not only to low-latency storage capacity but also to computation time of a central processing unit (CPU). The CPU time is required to process RPCs, compute cache placement policies (mappings from cache blocks to local storage addresses), manage cache eviction policies (in order to manage limited cache storage space), and provide concurrency control among many concurrent requests (server jobs are often multi-threaded in order to provide low latency and high throughput). This coupling of storage and computation requirements for cache server jobs can cause low utilization and/or high latency in general-purpose computing clusters, where server jobs are co-located with other jobs on the cluster's nodes. Such a result runs counter to the very purpose of the distributed cache.
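The server-side duties listed above can be illustrated with a minimal sketch. It is an in-process stand-in only, not an implementation from any particular system: the class name `CacheServerJob`, the methods `handle_store` and `handle_retrieve`, the slot-based storage layout, and the choice of least-recently-used (LRU) eviction are all assumptions made for illustration, and the RPC transport itself is omitted.

```python
import threading
from collections import OrderedDict
from typing import List, Optional


class CacheServerJob:
    """Hypothetical stand-in for a cache server job (no real RPC layer)."""

    def __init__(self, num_slots: int) -> None:
        # Local storage: one cache block per slot (assumed layout).
        self._storage: List[Optional[bytes]] = [None] * num_slots
        # Slot addresses that currently hold no block.
        self._free: List[int] = list(range(num_slots))
        # Placement policy: block id -> local slot address.
        # Insertion order doubles as LRU order (oldest entry first).
        self._placement: "OrderedDict[str, int]" = OrderedDict()
        # Concurrency control for multi-threaded request handling.
        self._lock = threading.Lock()

    def handle_store(self, block_id: str, data: bytes) -> None:
        """Handler a store RPC would invoke (assumed name)."""
        with self._lock:
            if block_id in self._placement:
                slot = self._placement[block_id]      # overwrite in place
            elif self._free:
                slot = self._free.pop()               # use an empty slot
            else:
                # Eviction policy: reuse the least recently used slot.
                _, slot = self._placement.popitem(last=False)
            self._storage[slot] = data
            self._placement[block_id] = slot
            self._placement.move_to_end(block_id)     # mark as most recent

    def handle_retrieve(self, block_id: str) -> Optional[bytes]:
        """Handler a retrieve RPC would invoke; returns None on a miss."""
        with self._lock:
            slot = self._placement.get(block_id)
            if slot is None:
                return None
            self._placement.move_to_end(block_id)     # refresh recency
            return self._storage[slot]
```

Every step in this sketch (lookup, placement, eviction, locking) executes on the CPU of the machine that holds the data, which is the coupling of storage and computation described above.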