This section is intended to introduce the reader to various aspects of the art that may be related to various aspects of the present invention. The following discussion is intended to provide information to facilitate a better understanding of the present invention. Accordingly, it should be understood that statements in the following discussion are to be read in this light, and not as admissions of prior art.
Caching appliances make one or more copies of the data stored on a network-attached storage (NAS) file server, based upon the data being recently accessed, with the goal of accelerating access to that data, as well as reducing the load placed on the server by the NAS clients. However, caching appliances, as currently implemented, are limited both in the types of operations that they can offload from the back-end file servers and in their scalability in a data center environment. Scalability is especially important in those environments where applications can run on any physical system within a large cloud of compute servers, each acting as a NAS client, and thus where there may be thousands of clients, or more, communicating with a small number of NAS file servers. This invention addresses these scaling and performance issues, as described below.
There have been a number of network file system caching devices released over the years. The first release of the Andrew File System (AFS), in 1984, performed disk-based caching, and NFS clients 20, from their first days, contained memory-resident caches. The NFS/AFS translator, a product from IBM Transarc Labs, supported caching of NFS files stored in an AFS global file system. All of these systems made local copies of data stored on a back-end file server (an AFS server in the NFS/AFS translator example), and serviced incoming NFS requests both by using the cached data and by making requests to the back-end file server.
File system caches are partially categorized by how they process write operations. They operate either in write-through mode, where every incoming write operation is forwarded to the back-end file server before being acknowledged, or in write-back mode, where incoming write operations may be acknowledged by the cache appliance before the data is actually written to the back-end file server. Write-through caches are simpler, since they use simpler techniques to ensure that all caches see the latest written data, and to ensure that in the event of multiple crashes, no acknowledged data is ever discarded.
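The difference between the two write modes can be illustrated with a minimal sketch. It is not taken from any of the systems named above; the class names `BackEndServer` and `Cache`, the `write_back` flag, and the `flush` method are hypothetical and chosen only for illustration.

```python
class BackEndServer:
    """Hypothetical stand-in for the back-end file server."""

    def __init__(self):
        self.blocks = {}

    def write(self, key, data):
        self.blocks[key] = data


class Cache:
    """Illustrative cache that runs in write-through or write-back mode."""

    def __init__(self, server, write_back=False):
        self.server = server
        self.write_back = write_back
        self.data = {}      # locally cached copies
        self.dirty = set()  # keys acknowledged but not yet on the server

    def write(self, key, data):
        self.data[key] = data
        if self.write_back:
            # Write-back: acknowledge now, send to the server later.
            self.dirty.add(key)
        else:
            # Write-through: forward to the server before acknowledging.
            self.server.write(key, data)
        return "ack"

    def flush(self):
        # Push every acknowledged-but-unwritten block to the server.
        for key in sorted(self.dirty):
            self.server.write(key, self.data[key])
        self.dirty.clear()
```

In this sketch, a crash of a write-back cache before `flush` runs would lose acknowledged data held only in `dirty`, which is the hazard that write-through mode avoids by construction.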
All of the systems discussed above perform write-through caching to ensure that NFS's “close-to-open” semantics are met, guaranteeing that an open of a file, performed after the program writing that file has closed it, will see the most recently written data. These systems also aggressively write data through to the server to ensure data persistence in the event of a crash of the caching system.
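The close-to-open guarantee can be sketched as follows, under stated assumptions: the client pushes pending writes to the server when a file is closed, and revalidates its cached copy against the server when a file is opened. The `Server` and `Client` classes and the integer version counter (standing in for file modification attributes) are hypothetical simplifications, not the interface of any system named above.

```python
class Server:
    """Hypothetical file server; a version counter stands in for attributes."""

    def __init__(self):
        self.files = {}  # path -> (version, data)

    def getattr(self, path):
        return self.files.get(path, (0, b""))[0]

    def read(self, path):
        return self.files.get(path, (0, b""))

    def write(self, path, data):
        self.files[path] = (self.getattr(path) + 1, data)


class Client:
    """Illustrative caching client with close-to-open consistency."""

    def __init__(self, server):
        self.server = server
        self.cache = {}    # path -> (version, data)
        self.pending = {}  # path -> data written but not yet on the server

    def open(self, path):
        # On open, revalidate: refetch if the server's version differs.
        cached = self.cache.get(path)
        if cached is None or cached[0] != self.server.getattr(path):
            self.cache[path] = self.server.read(path)
        return self.cache[path][1]

    def write(self, path, data):
        self.pending[path] = data

    def close(self, path):
        # On close, make all written data visible on the server.
        if path in self.pending:
            self.server.write(path, self.pending.pop(path))
            self.cache.pop(path, None)
```

With two such clients sharing one server, a reader that opens a file after the writer has closed it sees the writer's data, while a reader that opened earlier may continue to see its older cached copy until its next open.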
Gear6 provides a pure memory cache with a global directory that maintains a single copy of each piece of cached data in one of the cluster appliances, but it, too, writes data back to the back-end file server aggressively, partially to ensure persistence. Gear6 appliances also verify, on many references, that cached data is up to date, because Gear6 recommends that write-heavy loads go directly to the back-end filer to improve overall system performance.