The information stored in a distributed file system is spread over multiple networked computing devices and is usually made available to multiple client devices. Two of the major functions of a distributed file system are keeping the stored information self-consistent as it changes and coordinating access when multiple devices wish to read, and potentially modify, that information. In the past, these functions were performed by a central server that could ensure its own self-consistency. Today, however, serverless architectures with no central authority are proliferating, and they still need to coordinate access and to ensure that their stored information remains consistent.
Microsoft's FARSITE™ is one example of a serverless distributed file system. Logically, FARSITE™ acts like a central file server, but physically its computation, communication, and storage are distributed among the client computers in the system. Some or all of the client devices act together in small groups to perform functions analogous to those performed by the central server in a conventional server-based distributed file system. Because they perform the functions of a server, such groups are referred to as “servers” for purposes of the present discussion. For a detailed discussion of FARSITE™, please see U.S. patent application Ser. No. 10/005,629, “A Serverless Distributed File System,” filed on Dec. 5, 2001, and incorporated herein by reference in its entirety.
For performance reasons, it is advantageous for a distributed file system to perform a large fraction of its operations directly on the client devices that initiate them, rather than sending each operation individually to a server for remote processing. However, this presents a problem in keeping the file-system data consistent: if two clients make conflicting changes to the file-system state at the same time, the system enters an inconsistent state. One approach to this problem is to create resolution mechanisms that attempt to rectify inconsistencies after they occur; the distributed file system is allowed to enter an inconsistent state and is subsequently corrected, to whatever extent possible. A different approach is to prevent the inconsistency from occurring in the first place by restricting the operations performed by each client to those that do not conflict with operations performed by other clients. FARSITE™ takes this latter approach.
FARSITE™ protects file-system consistency by means of leases. A lease grants a client permission to observe or to modify a particular subset of the global file-system data fields. For example, a client may receive a lease that permits it to modify the contents of a particular file. This is commonly called a “write lease.” Alternatively, the client may receive a lease that permits it to observe the contents of a particular file. This is called a “read lease.” To protect consistency, if any client is granted a write lease on the contents of a particular file, then no other client may be granted a read lease on the contents of that same file. By contrast, it is permissible for multiple clients to have read leases on the contents of a single file as long as no client has a write lease on the file. This is known as “single writer, multiple reader” (SWMR) semantics.
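The SWMR compatibility rules described above can be sketched as a small lease table. This is a minimal illustration, not FARSITE™'s actual implementation; the class name, field keys, and method signatures are assumptions made for the example.

```python
# Sketch of "single writer, multiple reader" (SWMR) lease semantics:
# any number of read leases may coexist on a field, but a write lease
# excludes every other client, reader or writer. Names are illustrative.

class SWMRLeaseManager:
    def __init__(self):
        self.readers = {}  # field -> set of client ids holding read leases
        self.writer = {}   # field -> client id holding the write lease

    def grant_read(self, field, client):
        # A read lease is compatible with other read leases, but not
        # with a write lease held by a different client.
        w = self.writer.get(field)
        if w is not None and w != client:
            return False
        self.readers.setdefault(field, set()).add(client)
        return True

    def grant_write(self, field, client):
        # A write lease is incompatible with any lease held by another
        # client, whether for reading or for writing.
        w = self.writer.get(field)
        if w is not None and w != client:
            return False
        if self.readers.get(field, set()) - {client}:
            return False
        self.writer[field] = client
        return True
```

Under these rules, two clients may simultaneously hold read leases on a file's contents, but a write-lease request from a third client is refused until the read leases are recalled.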
When a client attempts an operation, it does not necessarily know what leases it will need for the operation. For example, Microsoft's WINDOWS™ file-system semantics specify that a directory may not be deleted if it contains any files or subdirectories (these are commonly called “children” of the directory). Thus, if a directory has no children (and if several other conditions are met), then the correct response to a delete-directory operation is to delete the directory and to return a success code. On the other hand, if the directory has at least one child, then the correct response to a delete-directory operation is to return an error code and to not delete the directory. The client requesting the delete-directory operation may not initially know whether the directory has any children, so it does not know whether it requires a read lease on every child field (all indicating the absence of a child) or a read lease on one child field (indicating the presence of a child). Furthermore, since there are other conditions that must hold for this operation to succeed, it is possible that no child leases are required at all, because the operation will fail for an unrelated reason. Client lease requests are therefore said to have “proof-or-counterexample” semantics: the server is obligated to provide either proof that an operation will succeed (by issuing all of the leases necessary for the successful completion of the operation) or a counterexample showing at least one cause for the operation to fail (typically a single read lease). Since the server generally has more information about this than does the client, the server is in a better position than the client to determine which leases the client needs (if any).
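The proof-or-counterexample decision for the delete-directory example above can be sketched as follows. The data model (a mapping from child names to presence flags) and the lease tuples are assumptions made for illustration, not FARSITE™'s actual protocol.

```python
# Sketch of "proof-or-counterexample" lease issuance for a
# delete-directory request, as described in the text. The server
# either proves the delete will succeed (read leases on every child
# field, all showing absence) or provides a counterexample (one read
# lease on an occupied child field). Data model is illustrative.

def leases_for_delete_directory(directory, child_fields):
    """child_fields maps each child-field name to True if occupied.

    Returns (will_succeed, leases), where each lease is a tuple
    ("read", directory, field_name).
    """
    occupied = [name for name, present in child_fields.items() if present]
    if occupied:
        # Counterexample: a single read lease on one occupied child
        # field suffices to show that the delete must fail.
        return False, [("read", directory, occupied[0])]
    # Proof: read leases on every child field, each showing the
    # absence of a child, together prove the delete will succeed.
    return True, [("read", directory, name) for name in child_fields]
```

Note that a real server would also check the operation's other preconditions first, since their failure can make all child leases unnecessary.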
Traditionally, if any data field in a distributed file system is leased to clients, then the data field is protected by an SWMR lease or by a degenerate form thereof. Degenerate forms include read-only leases, which protect data that clients are never allowed to modify, and single read/write leases, which allow access by only one client at a time, even for reading. All of these classes of leases are called “shared-value leases” because the lease governs access to a single value that is shared among all clients.
However, shared values represent a poor abstraction for some file-system data fields. One example is the set of handles that a particular client has open on a file, which needs to be a protected value in order to observe WINDOWS™ deletion semantics. In a WINDOWS™-compatible file system, file deletion follows this procedure:
- A client opens a handle to a file and uses this handle to set the “deletion disposition” bit on the file.
- Once the deletion disposition is set, no new handles may be opened on the file.
- Existing handles on the file can be closed normally, and when the last handle is closed, the file is unlinked from the file-system directory structure.
While the implementation of this procedure is straightforward on a centralized file server, it is more complicated in a distributed file system. When a client closes a handle on a file, the client must know whether it is closing the last handle that any client has open on the file. If an SWMR lease were used for the set of handles open on a file, then deletion would proceed as follows:
- When a client closes its locally last handle on a file, and if the deletion disposition (protected by an SWMR lease) is set, then the client must hold either a read lease on every other client's handle-set field (all indicating the absence of an open handle on the file) or a read lease on one particular client's handle-set field (indicating the presence of at least one open handle on the file).
- With this information, the client can determine whether it should unlink the file.
Using a shared-value lease in this application causes two problems: one of security and one of performance. Security suffers because a client can see which handles other clients have open on a file, which is information the client should not be privy to.
Performance suffers because while a client X holds a read lease on the set of handles that another client Y has open, client Y is unable to concurrently hold the write lease that would enable it to change the set of handles it has open.
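The deletion procedure described above is straightforward when one authority can see every open handle. The following sketch shows the centralized version of the procedure for comparison; the class and method names are illustrative assumptions, not an actual WINDOWS™ or FARSITE™ interface.

```python
# Sketch of the WINDOWS-style deletion procedure on a centralized file
# server, where a single authority tracks every open handle. In a
# distributed system, the handle and deletion-disposition state below
# would instead be spread across clients and protected by leases.

class FileServer:
    def __init__(self):
        self.handles = {}     # file -> set of open handle ids
        self.delete_bit = {}  # file -> deletion disposition
        self.linked = set()   # files present in the directory structure

    def create(self, f):
        self.linked.add(f)
        self.handles[f] = set()
        self.delete_bit[f] = False

    def open(self, f, h):
        # Once the deletion disposition is set, no new handles may be
        # opened on the file.
        if self.delete_bit[f]:
            raise OSError("delete pending")
        self.handles[f].add(h)

    def set_deletion_disposition(self, f, h):
        # The bit is set via an already-open handle on the file.
        assert h in self.handles[f]
        self.delete_bit[f] = True

    def close(self, f, h):
        self.handles[f].discard(h)
        # When the last handle closes and the bit is set, the file is
        # unlinked from the directory structure.
        if self.delete_bit[f] and not self.handles[f]:
            self.linked.discard(f)
```

The difficulty in the distributed case is precisely the `close` step: no single client can evaluate `not self.handles[f]` without observing other clients' handle sets, which is what the shared-value lease exposes at a cost to security and performance.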