1. Field of the Invention
The present invention relates generally to cache management in computing systems, and more particularly, to maintaining cache coherency in a computing environment such as a distributed computing environment.
2. Description of the Prior Art
In computing systems, a cache is a memory system or subsystem which transparently stores data so that future requests for that data can be served faster. As an example, many modern microprocessors incorporate an instruction cache holding a number of instructions; when the microprocessor executes a program loop where the same set of instructions is executed repeatedly, these instructions are fetched from the instruction cache, rather than from an external memory device at a performance penalty of an order of magnitude or more.
Similarly, most modern operating systems cache disk objects for re-use. When an application program reads a file from a disk, the file contents are stored as one or more objects in a cache memory so that a subsequent request for that file can be fulfilled from the cache, which will be much faster than reading the file once more from the disk. It is to be understood herein that an object is a disk file, a portion of a disk file, a data object stored on an object storage system, or other contents of a storage device, a disk drive or other storage media.
As is known in the art, a cache manager, for example for disk operations, intercepts input/output (I/O) requests to the disk. When the cache manager receives a read request from a client such as an application program, the cache manager determines if the required object is stored in the cache, and returns that object to the client if it is stored in the cache; this is known as a cache hit. If the required object is not stored in the cache, this is known as a cache miss; the object is read from the disk, the cache manager places a copy of the object in the cache, and returns the object to the client.
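By way of illustration only, the read path described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular system: the class name CacheManager, the read_from_disk callable, and the hits and misses counters are hypothetical names introduced here for the example.

```python
from typing import Callable, Dict


class CacheManager:
    """Hypothetical read-through cache manager (illustrative names only)."""

    def __init__(self, read_from_disk: Callable[[str], bytes]) -> None:
        # read_from_disk stands in for the slow storage device (assumption).
        self._read_from_disk = read_from_disk
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def read(self, object_id: str) -> bytes:
        # Cache hit: the object is already in the cache, so return it directly.
        if object_id in self._cache:
            self.hits += 1
            return self._cache[object_id]
        # Cache miss: read from disk, keep a copy in the cache, then return it.
        self.misses += 1
        data = self._read_from_disk(object_id)
        self._cache[object_id] = data
        return data
```

In this sketch, a first read of a given object is a cache miss that goes to the disk, and a second read of the same object is a cache hit served from memory. The sketch makes no attempt to detect whether the object on the disk has changed in the meantime.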
Issues arise with multiple applications making use of the same data, and with multiple distributed caches serving multiple applications. This problem is known as cache coherency, and concerns the fundamental question of whether a given copy of an object in a cache is valid. If the copy of the object in the cache is the same as the object on the storage device, then the copy of the object in the cache is valid and can be used. Conversely, if the object on the storage device has been modified since the copy of the object in the cache was made or stored, then the copy of the object in the cache is not valid and cannot be used; it is considered stale. The first issue then is identifying that an object in the cache is stale. The second issue is that, once a stale cache object is identified, the object must be retrieved again from the storage device at a substantial performance penalty compared to retrieval from the cache.
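The two issues can be illustrated with a short Python sketch. The text above does not specify how staleness is detected; the per-object version counter used here is an assumption made purely for illustration, and the names Storage, ValidatingCache, and fetches are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class StoredObject:
    data: bytes
    version: int


class Storage:
    """Stand-in for the storage device; bumps a version on every write (assumption)."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def write(self, object_id: str, data: bytes) -> None:
        current = self._objects.get(object_id)
        next_version = current.version + 1 if current else 1
        self._objects[object_id] = StoredObject(data, next_version)

    def version(self, object_id: str) -> int:
        return self._objects[object_id].version

    def read(self, object_id: str) -> Tuple[bytes, int]:
        obj = self._objects[object_id]
        return obj.data, obj.version


class ValidatingCache:
    """Cache that treats a copy as stale when its version differs from storage."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage
        self._entries: Dict[str, Tuple[bytes, int]] = {}
        self.fetches = 0  # number of full retrievals from storage

    def read(self, object_id: str) -> bytes:
        entry = self._entries.get(object_id)
        # First issue: decide whether the cached copy is still valid.
        if entry is not None and entry[1] == self._storage.version(object_id):
            return entry[0]
        # Second issue: a missing or stale copy must be retrieved again.
        data, version = self._storage.read(object_id)
        self._entries[object_id] = (data, version)
        self.fetches += 1
        return data
```

Note that this sketch consults the storage device for the current version on every read. That check is cheap in this in-memory example, but in a real system it would be an additional round trip to the storage device.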
In other environments, such as where a computing system hosts multiple virtual machines under the control of a hypervisor, with each virtual machine running one or more programs, caching of objects stored on a network attached storage system can provide significant performance improvements. This presumes, however, that the handling of cache coherency issues can scale with the workload.
These cache coherency issues increase in a distributed computing environment, with multiple computing systems, each hosting multiple virtual machines, each seeking to cache content from a shared network attached storage system. In such a distributed computing environment, without caching, the shared network attached storage system becomes a performance bottleneck. Caching can alleviate this bottleneck, but using a centralized cache management strategy, in which all cache accesses must be checked by a centralized cache manager, replaces one bottleneck with another, and introduces another single point of failure into the distributed environment.
What is needed is a method to manage cache coherency in a computing environment and particularly in a manner which can scale with the requirements of a distributed computing environment, without introducing performance bottlenecks.