1. Field of the Invention
The present invention relates to file systems, and deals more particularly with techniques for enabling clients to realize advantages of file system referrals, including a uniform name space and an ability to locate content in a (nearly) transparent manner, even though the content may be dynamically moved from one location to another or replicated among locations.
2. Description of the Related Art
The term “file system” generally refers to collections of files and to utilities which can be used to access those files. Distributed file systems, referred to equivalently herein as network file systems, are file systems that may be physically dispersed among a number of different locations. File access protocols are used to communicate between those locations over a communications network, enabling operations to be carried out for the distributed files. File access protocols are designed to allow a client device to access remotely-stored files (or, equivalently, stored objects or other content) as if the files were stored locally (i.e., in one or more repositories that are local to the client device). The server system performs functions such as mapping requests which use the file access protocols into requests to actual storage repositories accessible to the server, or alternatively, returning network location information for requested content that is stored elsewhere.
Example file access protocols include “NFS”, “WebNFS”, and “CIFS”. “NFS” is an abbreviation for “Network File System”. “CIFS” is an abbreviation for “Common Internet File System”. The NFS protocol was developed by Sun Microsystems, Inc. Version 2 of the NFS protocol is documented in Request For Comments (“RFC”) 1094, titled “Network File System” and dated March 1989. A more recent version of the NFS protocol is NFS Version 3, which is documented in RFC 1813, titled “Network File System Version 3” and dated June 1995. (NFS Version 4 is currently under development, and is documented in Internet Draft specification 3010, titled “NFS Version 4 Protocol” and dated November 2001.) “WebNFS” is designed to extend the NFS protocol for use in an Internet environment, and was also developed by Sun Microsystems. CIFS is published as X/Open CAE Specification C209, copies of which are available from X/Open.
When a client device needs to access a remotely-stored file, the client-side implementation of a file access protocol typically queries a server-side implementation for the file. The server-side implementation may perform access control checks to determine whether this client is allowed to access the file, and if so, returns information the client-side implementation can use for the access. Hereinafter, the client-side implementation and server-side implementation will be referred to as the client and server, respectively.
Information specifying the file's location in the distributed file system (e.g., the server on which the file is stored, and the path within that server's storage resources) is used by the client to perform a mount operation for the requested file. A successful “mount” operation makes the file's contents accessible to the client as if stored locally. Information used in performing the mount operation, typically referred to as “mount instructions”, may be stored on the client or may be fetched from a network database or directory (e.g., using a directory access protocol such as the Lightweight Directory Access Protocol, or “LDAP”, or the Network Information Service, or “NIS”).
It is assumed for purposes of discussing the present invention that objects are arranged in a hierarchical tree-like structure, where files are arranged in directories and directories can contain other directories. Access to objects is achieved using path names, where a component of the path name designates a sub-directory in the tree. The path starts at the top of the tree. A common convention uses forward slashes or back slashes to separate sub-directories, and a single slash or backslash at the beginning of the path refers to the top or “root” of the hierarchy. For example, the path “/a/b/C” refers to an object “C” that is in directory “b”. Directory “b” is in directory “a”, which belongs to the root.
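The path convention described above can be sketched in a few lines. This is a minimal illustration only, assuming a nested dictionary stands in for the directory tree; the names `split_path`, `resolve`, and `tree` are hypothetical and do not come from any file access protocol.

```python
def split_path(path):
    """Split an absolute path into its components, dropping empty parts."""
    if not path.startswith("/"):
        raise ValueError("this sketch handles only paths that start at the root")
    return [part for part in path.split("/") if part]


def resolve(root, path):
    """Walk the tree from the root, descending one component at a time."""
    node = root
    for component in split_path(path):
        if not isinstance(node, dict) or component not in node:
            raise FileNotFoundError(path)
        node = node[component]
    return node


# Hypothetical tree: the root holds "a", "a" holds "b", and "b" holds object "C".
tree = {"a": {"b": {"C": "contents of C"}}}

print(split_path("/a/b/C"))
print(resolve(tree, "/a/b/C"))
```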
After a mount operation, the mounted file system appears to reside within the hierarchical directory structure that defines the client's local file system, at a location within that hierarchical structure that is referred to as a “mount point”. The mount operation allows the hierarchically-structured file systems from multiple sources to be viewed and managed as a single hierarchical tree on a client system.
In some cases, a client will request content directly from the server at which the content is available. However, it may also happen that a client requests content from a server that does not have the content. To handle these latter types of references, individual file systems in a network file system may support referrals to content in other file systems. FIGS. 1A-1D depict examples of such referrals within a network file system. Particularly, with reference to FIG. 1A, file system 106 includes a directory “usr”. The “usr” directory includes a reference to file system “foo”. When a client queries file system 106 for content stored in file system “foo”, the reference will redirect (i.e., “refer”) the client to file system 116.
In effect, referrals enable linking together multiple file systems. Referring to FIG. 1B, when the client application accesses the referral in file system 106, the referral is replaced, from the application's point of view, by the root of the referred file system 116. A single name space is formed when the replacement is made, including files locally available on the client system as well as files available from file systems 106 and 116.
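The replacement of a referral by the root of the referred file system might be modeled as follows. This is a sketch under stated assumptions: the `Referral` class, the `FILE_SYSTEMS` table, the file name `data.txt`, and the `lookup` function are all hypothetical, and the two file systems are simple in-memory dictionaries rather than real servers.

```python
class Referral:
    """Directory entry that points at the root of another file system."""

    def __init__(self, target_fs):
        self.target_fs = target_fs


# Hypothetical contents: file system 106 holds "usr", which holds a referral
# named "foo" that points at file system 116.
FILE_SYSTEMS = {
    "fs116": {"data.txt": "stored on file system 116"},
    "fs106": {"usr": {"foo": Referral("fs116")}},
}


def lookup(fs_name, path):
    """Resolve a path, substituting the referred root wherever a referral is met."""
    node = FILE_SYSTEMS[fs_name]
    for component in [part for part in path.split("/") if part]:
        node = node[component]
        if isinstance(node, Referral):
            # The referral is replaced by the root of the referred file system.
            node = FILE_SYSTEMS[node.target_fs]
    return node


print(lookup("fs106", "/usr/foo/data.txt"))
```

From the caller's point of view, the path "/usr/foo/data.txt" belongs to one name space, although the final component is stored in a different file system from the first two.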
The reference illustrated in FIG. 1A may be termed a “hard-coded” reference. For various reasons, file content may be moved from one location to another, such as to a new server. (For example, the previously-used server might fail, or content might be redistributed to alleviate performance bottlenecks, space shortages, and so forth.) When hard-coded references are used, the stored location may therefore become obsolete.
The redirection process is illustrated with reference to FIG. 1C, where file system 106 again includes a directory “usr” and the “usr” directory includes a reference to file system “foo”. Suppose that file system 106 receives a request for file system “foo”, but that “foo” has now moved from file system 116 to file system 126. The hard-coded reference in file system 106 continues to redirect the requester to file system 116. Therefore, file system 116 must include information to redirect the requester to file system 126. To avoid the performance penalty of subsequent references to the now-obsolete location and of processing additional redirections, the hard-coded reference in file system 106 must be changed to indicate the new location of the file content in file system 126.
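The cost of an obsolete hard-coded reference can be illustrated by counting the locations a requester visits. The sketch below assumes a hypothetical `references` table that maps each file system to the location it currently points at for “foo”; the `follow` function is illustrative and is not part of any protocol.

```python
def follow(references, start, content_location):
    """Follow stored references from `start` until the content's location is reached."""
    hops = [start]
    current = start
    while current != content_location:
        current = references[current]
        if current in hops:
            raise RuntimeError("reference loop detected")
        hops.append(current)
    return hops


# Hypothetical state after "foo" moves from file system 116 to file system 126:
# 106 still points at 116, and 116 carries a redirection to 126.
references = {"fs106": "fs116", "fs116": "fs126"}
before = follow(references, "fs106", "fs126")

# Updating the hard-coded reference in 106 removes the extra redirection.
references["fs106"] = "fs126"
after = follow(references, "fs106", "fs126")

print(before)
print(after)
```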
There may be instances where updating the hard-coded reference in file system 106 is, by itself, insufficient, such that it is necessary to retain the redirection information at file system 116. For example, suppose that a copy of file system 106 has been made, prior to revising the hard-coded reference. This copying process is referred to as “replication”, and may be performed for several reasons, including increased reliability, increased throughput, and/or decreased response time. If file system 106 has been replicated, then multiple copies of the now-obsolete hard-coded link may exist. See, for example, FIG. 1D, where file system 106 again includes a hard-coded reference to file system “foo” which was determined, at some point in time, to be available from file system 116. Further suppose that file system 106 is replicated as file system 136 and also as file system 146, each of which then includes its own reference to file system “foo” in file system 116. If the content identified by the reference moves to file system 126, then simply updating the reference stored on file system 106 is insufficient, as file systems 136 and 146 will continue to use the obsolete reference to file system 116. Therefore, file systems 106, 136, and 146 must all be updated (even if the file systems were intended for read-only access) to include information to redirect the client to file system 126 (or the intermediate link between file systems 116 and 126 must be maintained, with its inherent performance penalties). As will be obvious, this situation is not only inefficient, but also highly error-prone. Maintaining an awareness of each moved file system and/or replication of references is not a viable solution because of its administrative burden.
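The replication problem of FIG. 1D can be reproduced with a small model. This sketch assumes that each file system is a dictionary and that replication is a deep copy; the variable names are hypothetical and chosen to mirror the reference numerals in the figure.

```python
import copy

# File system 106 holds a hard-coded reference to "foo" on file system 116.
fs106 = {"usr": {"foo": "fs116"}}

# Replicas made before the reference is revised carry their own copy of it.
fs136 = copy.deepcopy(fs106)
fs146 = copy.deepcopy(fs106)

# "foo" moves to file system 126, and only the original is updated.
fs106["usr"]["foo"] = "fs126"

replicas = {"fs106": fs106, "fs136": fs136, "fs146": fs146}
stale = [name for name, fs in replicas.items() if fs["usr"]["foo"] != "fs126"]

print(stale)
```

Every name left in `stale` is a file system that still directs requesters to the obsolete location and would have to be updated separately.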
Referring now to FIGS. 2A and 2B, examples of particular file systems that support referrals will be described. The scenario shown in FIG. 2A is illustrative of processing using version 4 of the NFS protocol, referred to hereinafter as “NFSv4”. Client 202 requests an object “X” from file system (“FS”) server #1 206 (step 1). However, X is a mounted file system which actually exists on FS server #2 216 instead of on FS server #1 206. File system server #1 206 is aware of this actual location. NFSv4 requires that each referencing server (i.e., a server which stores a referral to another server) include knowledge of the location and path for each mounted file system in the references returned to its clients. Therefore, FS server #1 206 sends client 202 a redirection message identifying FS server #2 and the path, shown in the example as “/a/b/c/X”, which may be used to find X on FS server #2 (step 2). Next, client 202 uses the information received in the redirection message to access /a/b/c/X on server #2 (step 3).
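The three steps of FIG. 2A can be sketched as a client loop that follows a redirection message. This is an in-memory model only and does not reproduce the NFSv4 wire protocol; the `Redirect` class, the `SERVERS` table, the server names, and the `max_redirects` bound are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Redirect:
    """Redirection message carrying the actual server location and path."""
    server: str
    path: str


# Hypothetical server contents: server #1 stores only a referral for X,
# and server #2 stores X itself at /a/b/c/X.
SERVERS = {
    "fs-server-1": {"/X": Redirect("fs-server-2", "/a/b/c/X")},
    "fs-server-2": {"/a/b/c/X": "contents of X"},
}


def request(server, path):
    """Ask one server for a path; the reply is either content or a Redirect."""
    return SERVERS[server][path]


def client_fetch(server, path, max_redirects=8):
    """Follow redirection messages until content is returned."""
    for _ in range(max_redirects + 1):
        reply = request(server, path)          # step 1 (and step 3 on later passes)
        if not isinstance(reply, Redirect):
            return reply
        server, path = reply.server, reply.path  # step 2: use the returned location
    raise RuntimeError("too many redirections")


print(client_fetch("fs-server-1", "/X"))
```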
Note that earlier versions of the NFS protocol do not support referrals or redirection, and thus a down-level NFS client (e.g., a client implementing NFS version 2 or 3) does not understand a redirection message.
A server can send a redirection message that redirects the client to the server itself. This may be useful, for example, when a file system object is moved within a server. In addition, a chain of redirection messages may be used, for example, when an object is moved more than once.
As another example, FIG. 2B depicts operation using the Distributed Computing Environment's Distributed File System (hereinafter, “DCE/DFS”), which is another network file system that allows referrals to remote machines. Using DCE/DFS, client 202 requests an object “X” from FS server #1 206 (step 1). As in the scenario shown in FIG. 2A, suppose that X is a mounted file system existing on FS server #2 216. According to the DCE/DFS protocol, FS server #1 206 sends the client an indirection message. Rather than including the actual location of a referred file system, as in the redirection message in FIG. 2A, the indirection message in FIG. 2B includes an indirect file system identifier (“FSID”), referred to in the examples as “Y”, that may be used by client 202 to find the file system (step 2). After receiving this indirection message, client 202 requests the location of “Y” from a file system location database, or “FSLDB”, 220 (step 3). The FSLDB returns the location of Y, “FS server #2,” to client 202 (step 4). Thereafter, client 202 uses the location of FS server #2 to request the object from FS server #2 216 (step 5).
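The five steps of FIG. 2B can be modeled in the same style. The sketch below is not the DCE/DFS protocol itself; the `Indirection` class, the `FSLDB` and `SERVERS` dictionaries, and the server names (including "fs-server-3") are hypothetical stand-ins used to show where the location lookup happens.

```python
from dataclasses import dataclass


@dataclass
class Indirection:
    """Indirection message carrying only a file system identifier (FSID)."""
    fsid: str


# Hypothetical location database: FSID "Y" currently lives on FS server #2.
FSLDB = {"Y": "fs-server-2"}

SERVERS = {
    "fs-server-1": {"X": Indirection("Y")},
    "fs-server-2": {"X": "contents of X"},
    "fs-server-3": {},
}


def client_fetch(server, name):
    reply = SERVERS[server][name]            # steps 1 and 2
    if isinstance(reply, Indirection):
        location = FSLDB[reply.fsid]         # steps 3 and 4: client asks the FSLDB
        reply = SERVERS[location][name]      # step 5
    return reply


first = client_fetch("fs-server-1", "X")

# Moving the file system requires only a database update; the referral
# stored on FS server #1 is left unchanged.
SERVERS["fs-server-3"]["X"] = SERVERS["fs-server-2"].pop("X")
FSLDB["Y"] = "fs-server-3"
second = client_fetch("fs-server-1", "X")

print(first, second)
```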
NFSv4 and similar network file systems require that a referring server (such as FS server #1 206) know the correct locations where clients should be redirected, as stated earlier. An obvious implementation of referrals in NFSv4 and similar network file systems is therefore to embed the locations of the referenced file systems directly in the data stored in the referring file system. However, as described above with reference to FIGS. 1C and 1D, hard-coding references has a number of disadvantages. DCE/DFS avoids these disadvantages by storing only an identifier for the target file system in the referencing file system. The referring file system returns this identifier to the client, and the client then uses it to look up the current location for the file system. In another approach, the related invention defines techniques whereby a referring server having a key stored in a referral object uses that key to perform the lookup operation for the client. This referring server may obtain the actual server location and path for the target (i.e., referred) file system from a database, table, or other storage repository, and then returns the result (or, alternatively, the server location and an encoded FSID representation that is sent instead of a path) to the client. The client then uses this information, sending a new file access request to the identified server location.
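The server-side lookup described in the preceding paragraph might look like the following. This is only a sketch of the general idea and is not the implementation of the related invention; the key "key-foo", the two tables, and the `handle_request` function are hypothetical.

```python
# Hypothetical repository mapping referral keys to the current location and path.
LOCATION_TABLE = {"key-foo": ("fs-server-2", "/a/b/c/X")}

# Hypothetical referral objects stored on the referring server: path -> key.
REFERRAL_OBJECTS = {"/X": "key-foo"}


def handle_request(path):
    """Return a referral result for `path`, or None if it is not a referral object."""
    key = REFERRAL_OBJECTS.get(path)
    if key is None:
        return None
    # The referring server, not the client, performs the lookup.
    server, target_path = LOCATION_TABLE[key]
    return {"server": server, "path": target_path}


print(handle_request("/X"))
```

Because the referring file system stores only the key, a move of the target file system changes `LOCATION_TABLE` and leaves the referral object as it was.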
Some file access protocols do not support referrals or referral objects. For example, neither NFS version 2 nor NFS version 3 supports referrals. The advantages of referrals, and in particular the manner in which referrals enable unification of file systems into a global or uniform name space as well as provide for location transparency of referred file systems, are therefore not available to client devices running these older or “legacy” versions of file access protocols. Some protocols which provide referral support use proprietary implementations. Disadvantages of using proprietary software are well known, and include lack of access to source code, potential interoperability limitations, and so forth.
Accordingly, what is needed are techniques for allowing clients to realize the advantages of referral objects even though the file access protocol used by the client is not specifically adapted for referral objects.