1. Field of the Invention
The present invention relates to the field of computer systems. More specifically, the present invention relates to directory caching in a distributed computer system.
2. Art Background
Operating systems spend significant time performing path name lookups to convert symbolic path names to file identifiers. In order to reduce the cost of name lookups, many systems have implemented name caching schemes.
Name lookup is an even larger problem in distributed systems, where a client machine may have to contact a file server across a network to perform the name lookup. Typically, network file systems cache naming information on client workstations as well as on servers. This allows clients to perform most name lookups without contacting the server, thereby improving lookup speed by as much as an order of magnitude. In addition, client-level name caching reduces the load on the server and the network.
Distributed systems with a large number of workstations have a number of characteristics that can interfere with name caching. In a distributed environment, name caches on different machines must be kept consistent. This results in extra network messages and cost that is not required on a single time-shared system.
In a distributed environment, a very important overhead is the communication time involved in server requests. The actual operations on the server often take less time than the basic network communication. Name caching schemes typically require a separate server request for every component that is not in the name cache of the client, and typical path names contain several components. In contrast, a system without name caching can pass the entire path name to the server in a single operation (i.e., there can never be more than one server request per lookup). This means that an individual lookup operation can take substantially longer with a client-level cache than without one.
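The request counts described above can be illustrated with a minimal sketch. The function names and the example path names are hypothetical and chosen only for illustration; the sketch assumes one server request per uncached path name component and ignores everything else a real lookup would do.

```python
def requests_with_name_cache(path, cached_prefixes):
    """Count server requests when every uncached component needs its own request.

    cached_prefixes is a set of path prefixes already resolved on the client;
    it is updated in place as components are fetched (hypothetical model).
    """
    requests = 0
    prefix = ""
    for component in (c for c in path.split("/") if c):
        prefix = prefix + "/" + component
        if prefix not in cached_prefixes:
            requests += 1                 # one round trip for this component
            cached_prefixes.add(prefix)
    return requests


def requests_without_name_cache(path):
    """Without a client cache, the whole path name travels in one request."""
    return 1


cache = set()
cold = requests_with_name_cache("/usr/local/lib/libfoo.a", cache)   # empty cache
warm = requests_with_name_cache("/usr/local/lib/libfoo.a", cache)   # fully cached
partial = requests_with_name_cache("/usr/local/bin/tool", cache)    # shared prefix cached
print(cold, warm, partial, requests_without_name_cache("/usr/local/lib/libfoo.a"))
```

With a cold cache the four-component path name costs four requests, against a single request for the uncached system, which is the worst case the paragraph describes. Once the components are cached, the same lookup needs no server request at all.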
A name cache is usually accompanied by a separate cache of file attributes such as permissions, file size, etc. The attributes in the attribute cache are typically managed separately from entries in the name cache, resulting in additional server requests.
Some implementations of name caching use a whole-directory approach, meaning that they cache entire directories. This approach may not work well with load-sharing techniques where a single user spawns processes on several machines simultaneously. If those processes work in a single directory, then a substantial amount of overhead may be required to keep the cached directory consistent on the multiple machines. Similarly, highly shared directories such as the UNIX /tmp directory can also add to the overhead of maintaining cache consistency.
In a distributed system that provides access to remote directories, there is a need to maintain the names contained in the directories in a coherent fashion. Most name servers implement coherent caching using expiration timers. In such a scheme, the client can cache an entry only for a time interval T. The client either must renew the entry before the interval expires, or the entry must be discarded from the cache. When a server passes an entry to a client for caching, the server, in effect, makes a promise not to delete the entry on the server in the next T seconds.
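The client side of an expiration-timer scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not any particular name server's implementation: the class names, the injectable clock, and the local-only `renew` method are all hypothetical, and in a real system a renewal would involve a request to the server.

```python
import time


class ExpiringNameCache:
    """Client cache in which each entry is valid for interval_t seconds."""

    def __init__(self, interval_t, clock=time.monotonic):
        self.interval_t = interval_t
        self.clock = clock            # injectable so the sketch can be tested
        self.entries = {}             # name -> (file_id, expiry_time)

    def insert(self, name, file_id):
        self.entries[name] = (file_id, self.clock() + self.interval_t)

    def lookup(self, name):
        entry = self.entries.get(name)
        if entry is None:
            return None
        file_id, expiry = entry
        if self.clock() >= expiry:    # interval T has passed: discard the entry
            del self.entries[name]
            return None
        return file_id

    def renew(self, name):
        """Extend an unexpired entry by another interval (server contact omitted)."""
        entry = self.entries.get(name)
        if entry is None or self.clock() >= entry[1]:
            self.entries.pop(name, None)
            return False
        self.entries[name] = (entry[0], self.clock() + self.interval_t)
        return True


class FakeClock:
    """Manually advanced clock, used only to exercise the sketch."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now
```

The entry is trusted without any server contact until its expiry time, which is why the server must promise not to delete the name during that interval.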
Although the algorithm is easy to implement, it is not suitable for systems that need to create and delete entries very frequently. A UNIX directory is an example of a name server that cannot use such a scheme because many files can be created or deleted each second, making caching based on expiration times a poor strategy. Systems that frequently create and delete entries cannot use expiration-timer-based algorithms and must employ more complicated coherence protocols. Such protocols typically involve two-way communication between the client and server. This communication can be represented by a pair of objects called "provider" and "cache". The "provider" object on the server side handles the name lookup, creation, and deletion requests initiated by the client. The "cache" object on the client side responds to invalidate requests initiated by the server.
Typical solutions require memory storage at the server that is proportional to the number of directories (or even worse, proportional to the number of directories multiplied by the number of names in those directories). In the prior art, the server maintains perfect knowledge about each name that is looked up by each client and sends invalidates on a per-name basis when needed. This approach is not readily scalable, and can break down for large systems. That is, as a network becomes increasingly large, the amount of memory required to maintain directory coherency using perfect knowledge of the system becomes prohibitive. At the same time, directory coherency must be maintained if a single system image of the large system is to be preserved.
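The provider and cache pair, together with the per-name bookkeeping just described, can be sketched for a single directory as follows. This is an illustrative sketch only: the method names, the `holders` table, and the `bookkeeping_entries` helper are assumptions made for the example, and network transport, negative caching, and failure handling are omitted.

```python
class Provider:
    """Server side: handles lookup, creation, and deletion requests."""

    def __init__(self):
        self.directory = {}   # name -> file identifier
        self.holders = {}     # name -> set of client caches that looked it up

    def lookup(self, cache, name):
        file_id = self.directory.get(name)
        if file_id is not None:
            # "Perfect knowledge": remember that this client holds this name.
            self.holders.setdefault(name, set()).add(cache)
        return file_id

    def create(self, name, file_id):
        self.directory[name] = file_id

    def delete(self, name):
        self.directory.pop(name, None)
        # Send a per-name invalidate to every client that cached the name.
        for cache in self.holders.pop(name, set()):
            cache.invalidate(name)

    def bookkeeping_entries(self):
        """Number of (name, client) pairs the server must remember."""
        return sum(len(caches) for caches in self.holders.values())


class Cache:
    """Client side: caches names and responds to invalidate requests."""

    def __init__(self, provider):
        self.provider = provider
        self.names = {}       # name -> file identifier

    def lookup(self, name):
        if name in self.names:
            return self.names[name]
        file_id = self.provider.lookup(self, name)
        if file_id is not None:
            self.names[name] = file_id
        return file_id

    def invalidate(self, name):
        self.names.pop(name, None)


provider = Provider()
provider.create("a.txt", 101)
provider.create("b.txt", 102)
clients = [Cache(provider) for _ in range(3)]
for client in clients:
    client.lookup("a.txt")
    client.lookup("b.txt")
entries_before = provider.bookkeeping_entries()   # one entry per name per client
provider.delete("a.txt")                          # invalidates all three caches
entries_after = provider.bookkeeping_entries()
```

In this sketch the server's table grows with the number of names looked up multiplied by the number of clients holding them, and a real server would keep such a table for every directory. That growth is the scaling problem the paragraph identifies.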
Thus, for a large system, a typical straightforward directory coherence scheme requires using a prohibitively large amount of memory to store information used to maintain the coherency. It is desirable, however, to maintain the correctness criterion of coherent access to directories for these large systems, but at the same time to reduce memory requirements over previous solutions.