Many modern-day operating systems (OSs) have file systems that include the ability to mount a remote file system at a mount point on the local file system, giving the impression of a single file system that spans both a local partition and a remote file system. The remote file system might be another partition on the local disk, or it might be a remote file system that is accessed by a protocol such as NetWare Core Protocol (NCP), Server Message Block Protocol (SMB)/Common Internet File System Protocol (CIFS), or Network File System Protocol (NFS).
Any remotely mounted file system adds significant value and works well when there are no changes in the disk hardware or servers that support the remote file systems; however, when there are changes, the amount and scope of the reconfiguration management steps are significant and grow rapidly with the number of nodes and mount points needed.
For example, consider a server “A” that has a remote mount point to server “B.” The mount point on server A includes the identity of server B as well as some sort of location information for B, called “Loc(B).” If Loc(B) ever changes, then the mount point has to be updated. For example, if Loc(B) is the IP address of B and B's address is changed, the mount point on A must be updated. If Loc(B) is the Domain Name Service (DNS) name of B, then B can change its address because the DNS server will then serve up the new address of B, and in this case the mount point on A need not be updated. If the location of the file system on B moves, then the mount point on A must be updated. If the file system itself moves from B to C, then the mount point on A must be updated. So, in many cases, if the data on B, the location of B, or even the identity of B ever changes, then the mount points on A must be updated.
Now consider 100 servers, A1 through A100, which all have mount points to B. Whenever there is a change to the data on B or the location of B, every mount point on every server A1 through A100 must be updated. Obviously, this is time-consuming and inefficient for an enterprise.
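The scale of the problem can be illustrated with a minimal sketch. The `MountPoint` class, the `retarget_mounts` helper, the paths, and the IP addresses are all hypothetical names chosen for illustration; the sketch assumes Loc(B) is stored as an IP address in each mount point and does not correspond to any particular operating system's mount table.

```python
from dataclasses import dataclass


@dataclass
class MountPoint:
    """Hypothetical model of a mount point that stores the remote location directly."""
    local_path: str
    remote_host: str  # Loc(B): here assumed to be an IP address
    remote_path: str  # location of the file system on the remote server


def retarget_mounts(servers, old_host, new_host):
    """Rewrite every mount point that refers to old_host; return how many changed."""
    updated = 0
    for mounts in servers.values():
        for mount in mounts:
            if mount.remote_host == old_host:
                mount.remote_host = new_host
                updated += 1
    return updated


# Servers A1 through A100, each with one mount point to server B.
servers = {
    f"A{i}": [MountPoint("/mnt/data", "10.0.0.5", "/export/data")]
    for i in range(1, 101)
}

# B's address changes, so every mount point on every server has to be edited.
changed = retarget_mounts(servers, "10.0.0.5", "10.0.0.9")
print(changed)
```

In this model a single change on B touches one record per mount point per server, which is the administrative burden described above.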
Therefore, in most cases, a distributed file system is a better way to access data from a remote server, because it uses junctions rather than mount points. A junction is a file that includes a globally unique identifier (GUID), which identifies the location of a storage volume housing the data associated with the junction. The GUID's mapping to a specific volume for the data is stored as an entry in a database. If the final location of the data ever changes, no junctions on file systems need to be updated; only the single entry for that GUID in the database needs to be updated. One characteristic of a distributed file system, which is both a benefit and a drawback, is that client software is needed to handle the junction. If the server returns a junction as the result of a request to access a file on the file system, the client opens the junction file, obtains the GUID, looks up the GUID in the database to find out where the file system that hosts the volume of that file is located, and then “follows” that link to the server that has the data.
Take the earlier example of servers A and B, but now there is a junction on A to a file system on B. Client C will ask for a file on A, but if the request reaches the junction, A will return the junction to client C. C will then look up the correct location for the junction in the database (by opening the junction file, obtaining the GUID, and searching the database for the location), which in this case is a path to a file on B, and then will reference the file on B.
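The exchange between client C, server A, and server B can be sketched roughly as follows. This is an illustrative model only: the `Junction` and `Server` classes, the `client_read` function, the file paths, and the dictionary standing in for the database are assumptions made for the example, not the interface of any actual distributed file system.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """Hypothetical junction file: it holds only a GUID, not a location."""
    guid: str


class Server:
    """Hypothetical file server that returns file contents or a junction."""

    def __init__(self, name, files):
        self.name = name
        self.files = files  # path -> file contents (str) or Junction

    def read(self, path):
        return self.files[path]


def client_read(servers, database, server_name, path):
    """Client-side logic: follow a junction if the server returns one."""
    result = servers[server_name].read(path)
    if isinstance(result, Junction):
        # Obtain the GUID, look it up in the database, then talk to the target directly.
        target_server, target_path = database[result.guid]
        return servers[target_server].read(target_path)
    return result


guid = str(uuid.uuid4())
database = {guid: ("B", "/vol1/report.txt")}  # GUID -> (server, path)
servers = {
    "A": Server("A", {"/local.txt": "data on A", "/shared": Junction(guid)}),
    "B": Server("B", {"/vol1/report.txt": "data on B"}),
}

local_data = client_read(servers, database, "A", "/local.txt")
remote_data = client_read(servers, database, "A", "/shared")
```

Note that the junction-following logic lives entirely in `client_read`, which mirrors the point that each client needs software able to process junctions and query the database.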
Junctions are generally believed to be better than mount points for several reasons. 1) The data from B never has to flow through A in order to reach C. If C needs files that are on A, C talks directly to A. If C finds that files are on B, it will talk directly to B. 2) The junctions are indirect in that the actual location of the data for a junction is stored in the database (via a GUID) rather than on the file system for A. If changes are made to B or where the data is stored on B, only the data in the database needs to be changed (the mapping to the GUID); no changes are needed for any of the junctions on the file system on A.
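The second point, indirection through the database, can be shown with a small sketch. The dictionaries `volume_db` and `junctions`, the `resolve` function, and the GUID string and paths are hypothetical stand-ins chosen for illustration; the sketch assumes every junction stores only a GUID.

```python
# Hypothetical database: one entry mapping a GUID to (server, volume path).
volume_db = {"guid-1234": ("B", "/vol1/data")}

# Hypothetical junctions on servers A1 through A100, each storing only the GUID.
junctions = {f"A{i}": {"/shared": "guid-1234"} for i in range(1, 101)}


def resolve(server, path):
    """Resolve a junction on a server to the current location of its data."""
    guid = junctions[server][path]
    return volume_db[guid]


before = resolve("A42", "/shared")

# Keep a copy of the junctions to confirm that none of them change.
junctions_snapshot = {name: dict(entries) for name, entries in junctions.items()}

# The data moves from B to C: only the single database entry is updated.
volume_db["guid-1234"] = ("C", "/vol7/data")

after = resolve("A42", "/shared")
print(before, after, junctions == junctions_snapshot)
```

Compared with directly stored mount locations, the move is a single write to the database, and every junction resolves to the new location without being touched.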
Junction resolution will continue to work even if a volume is moved between servers. This can be done using volume manager move and split operations. If an entire server is moved to a different server, the data on the original target server will need to be moved using the volume manager operations to make sure the database is updated correctly. (Currently, move/split operations require the source and target volumes to be within the same processing management context.)
Accordingly, a distributed file system adds much value over conventional remote mounting approaches, but one potentially limiting drawback of junctions is that client software is needed on each client to correctly process the junction and to talk to the database. In other words, each of the clients in a distributed file system approach has to be aware of junctions and have software to handle them. This creates a support issue for an enterprise, similar to the one in the remote mount discussion presented above.
Consequently, there is a need for improved techniques for accessing remote files.