A distributed system generally includes many loosely coupled computers, each of which typically includes computing resources (e.g., one or more processors) and storage resources (e.g., memory, flash memory, and/or disks). A distributed storage system overlays a storage abstraction (e.g., key/value store or file system) on the storage resources of a distributed system. In the distributed storage system, a server process running on one computer can export that computer's storage resources to client processes running on other computers. Remote procedure calls (RPCs) may transfer data from server processes to client processes.
A remote procedure call is a two-sided software operation initiated by client software executing on a first machine and serviced by server software executing on a second machine. Servicing storage system requests (e.g., read data) in software may require an available processor, which may place a significant limitation on a distributed storage system: a client process cannot access a remote computer's storage resources unless the remote computer has an available processor to service the client's request. Moreover, the demands for processor resources and storage resources in a distributed system often do not match. In particular, computing resources (i.e., processors) may have heavy and/or unpredictable usage patterns, while storage resources may have light and very predictable usage patterns.
Typical logic implementing a distributed cache system can be divided between client jobs and server jobs. Server jobs placed on machines across a cluster respond to RPCs from clients instructing the server jobs to store or retrieve cache data on the machines on which the server jobs reside. The server jobs may require access not only to low-latency storage capacity but also to computation time of a central processing unit (CPU). The CPU time is required to process RPCs, compute cache placement policies (mappings from cache blocks to local storage addresses), manage cache eviction policies (in order to manage limited cache storage space), and provide concurrency control among many concurrent requests (server jobs are often multi-threaded in order to provide low latency and high throughput).
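For illustration only, the CPU-bound duties of such a server job can be sketched as follows. This is a minimal in-process sketch under stated assumptions, not the implementation of any particular system: the class name `CacheServerJob`, the method `handle_rpc`, the operation strings `"store"` and `"retrieve"`, and the choice of least-recently-used eviction are all hypothetical, and the network transport of a real RPC is omitted. The sketch shows that every request consumes server CPU time for dispatch, placement lookup, eviction, and locking.

```python
import threading
from collections import OrderedDict


class CacheServerJob:
    """Hypothetical stand-in for a cache server job (no real network RPC)."""

    def __init__(self, capacity_blocks):
        # Local storage: a fixed number of block-sized slots.
        self._slots = [None] * capacity_blocks
        self._free = list(range(capacity_blocks))
        # Placement policy: cache block key -> local slot index.
        # Kept in least-recently-used-first order for eviction.
        self._placement = OrderedDict()
        # Concurrency control: one lock serializes concurrent requests.
        self._lock = threading.Lock()

    def handle_rpc(self, op, key, value=None):
        """Process one client request; all of this runs on the server's CPU."""
        with self._lock:
            if op == "store":
                return self._store(key, value)
            if op == "retrieve":
                return self._retrieve(key)
            raise ValueError("unknown operation: " + repr(op))

    def _store(self, key, value):
        if key in self._placement:
            # Overwrite in place and mark the block as recently used.
            slot = self._placement[key]
            self._placement.move_to_end(key)
        else:
            if not self._free:
                # Eviction policy (assumed LRU): reclaim the oldest block's slot.
                _, evicted_slot = self._placement.popitem(last=False)
                self._free.append(evicted_slot)
            slot = self._free.pop()
            self._placement[key] = slot
        self._slots[slot] = value
        return True

    def _retrieve(self, key):
        slot = self._placement.get(key)
        if slot is None:
            return None  # cache miss
        self._placement.move_to_end(key)
        return self._slots[slot]
```

In this sketch, a call such as `CacheServerJob(capacity_blocks=2).handle_rpc("store", "a", b"1")` cannot complete unless a thread on the server machine is scheduled to execute it, which is the coupling of storage to computation described above.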
This coupling of storage and computation requirements for cache server jobs can cause low utilization and/or high latency in general-purpose computing clusters (where server jobs are co-located with other jobs on the cluster's nodes), which is counter to the very purpose of the distributed cache. For example, nodes in the cluster with unused random access memory (RAM) that could be used by cache servers may not have any spare CPU cycles with which to serve cache requests. In this case, the “stranded” RAM goes unused. On the other hand, nodes with spare CPU cycles can experience contention for those cycles, which leads to high latencies for the cache server's RPCs. In this case, the only remedy may be to accept the high latencies or run the nodes at lower CPU utilization.