Data file servers employing a distributed file storage protocol facilitate the storage and access of files across a computer network. Without a distributed file system, file servers each act as autonomous machines on a network, each server being managed individually and having a separate namespace containing an individual set of data. The distributed file system is responsible for linking the various file servers together into one file system, providing a federation of data that is managed as a unit, and a single namespace for all of the data contained therein.
One such distributed file system in widespread use is the Network File System (NFS), version 4. NFS version 4 can redirect a client machine to a different server if the resource the client is trying to access no longer resides, or never resided, on the server from which the client is requesting it. This allows file sets to be replicated and migrated between multiple servers, with the migration process being seamless and invisible to the client that is attempting to access files within the file sets.
One problem with the current approaches to data migration in NFS version 4 concerns the use of filehandles. The most common form of a filehandle is referred to as “persistent” in that it can be used by a client computer to refer to a file object, at any time, until the file object is deleted. A filehandle is typically constructed by a server from internal identifiers for the server, the file system, and the file object, the last of these typically being the inode number. This type of filehandle is problematic, however, when a system starts to support file migration, because these internal identifiers generally differ on the destination server.
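By way of illustration only, the following minimal Python sketch shows how a persistent filehandle of this kind might be assembled and why it stops matching after migration. The three-field layout, the field widths, and the names `FileHandle`, `encode_filehandle`, and `decode_filehandle` are hypothetical; an actual NFS server defines its own layout, and clients treat the filehandle as opaque bytes.

```python
import struct
from typing import NamedTuple


class FileHandle(NamedTuple):
    """Hypothetical decoded view of a persistent filehandle."""
    server_id: int
    fsid: int
    inode: int


# Assumed layout: three unsigned 64-bit big-endian integers (24 bytes total).
_LAYOUT = struct.Struct(">QQQ")


def encode_filehandle(server_id: int, fsid: int, inode: int) -> bytes:
    """Pack the server's internal identifiers into an opaque byte string."""
    return _LAYOUT.pack(server_id, fsid, inode)


def decode_filehandle(raw: bytes) -> FileHandle:
    """Unpack a filehandle; only the issuing server would normally do this."""
    return FileHandle(*_LAYOUT.unpack(raw))


# The same file before and after a file-level copy to another server:
# the destination assigns its own server identifier and inode number.
before = encode_filehandle(server_id=1, fsid=7, inode=4242)
after = encode_filehandle(server_id=2, fsid=7, inode=9001)

# The handle held by the client no longer matches anything on the new server.
handles_match = before == after
```

In this sketch `handles_match` is false, which is the condition the migration approaches discussed in the following paragraphs attempt to avoid or work around.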
To maintain the validity of existing filehandles, one method of migration in NFS is to perform a low-level copy of the file system, which allows each inode number to be identical on the new server. Because a filehandle is typically constructed using the inode number, the same filehandle can continue to be used after a file has been migrated as long as the inode number does not change. This method, however, requires copying the entire file system, block by block, to create a perfect mirror image on the new server.
Another approach is to assign to each copied file in the destination file system an inode number matching that of the file on the system from which it is being migrated, so that a filehandle identical to the one used on the source file system can be used. This technique, however, only works if the inode number is currently unused on the destination file system, and may therefore not allow the migration of files to an existing data file system. A similar workaround employs a mapping table to translate the inode value of each file on the source file system to that of the corresponding new file on the destination file system, enabling the new system to translate filehandles created on the source system to files on the destination system. The use of such a table, however, requires an entry for every migrated file, which is resource intensive and inefficient.
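The cost of the mapping-table workaround can be seen in a short Python sketch. The class `InodeMappingTable`, the function `migrate`, and the sequential allocation of destination inode numbers are assumptions made for illustration and do not describe any particular server implementation.

```python
class InodeMappingTable:
    """Hypothetical table translating source inode numbers to destination ones."""

    def __init__(self) -> None:
        self._map: dict[int, int] = {}

    def record(self, source_inode: int, dest_inode: int) -> None:
        self._map[source_inode] = dest_inode

    def translate(self, source_inode: int) -> int:
        """Resolve an inode taken from an old filehandle; KeyError if unknown."""
        return self._map[source_inode]

    def __len__(self) -> int:
        return len(self._map)


def migrate(source_inodes: list[int], first_free_dest_inode: int) -> InodeMappingTable:
    """Copy files one by one, recording one table entry per migrated file."""
    table = InodeMappingTable()
    next_inode = first_free_dest_inode  # assumed: destination allocates sequentially
    for src in source_inodes:
        table.record(src, next_inode)
        next_inode += 1
    return table


table = migrate([4242, 4243, 5000], first_free_dest_inode=100)
translated = table.translate(4243)  # inode the destination uses for source inode 4243
entries = len(table)                # grows linearly with the number of migrated files
```

Because the table holds one entry per file, its size and the lookup performed on every incoming filehandle scale with the size of the migrated file set.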
To facilitate the migration of files, NFS version 4 introduces the use of volatile filehandles. A volatile filehandle allows the server to inform the client that the filehandle has expired and is no longer valid, which prompts the client to perform a path lookup to rediscover the new filehandle for the file object. There are various classes of volatile filehandles, but in this context the most commonly used is a class that causes a server to expire a filehandle when a file object is migrated to a new system. This leads to extensive system activity after the file system is migrated, because a resource-intensive path lookup must be performed to determine the new filehandle for each file object. Additionally, a problem exists in situations where a file on the original file system is opened by a user and is subsequently deleted or renamed. The opened file can no longer be looked up by its original name because it has been unlinked or renamed.
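The expiry-and-lookup cycle described above, and the way it fails for an open file that has been unlinked, can be sketched in Python as follows. The `Server` class, its handle format, and `read_with_recovery` are hypothetical simplifications; the exception `FileHandleExpired` stands in for the NFS version 4 error NFS4ERR_FHEXPIRED.

```python
class FileHandleExpired(Exception):
    """Stand-in for the NFS version 4 error NFS4ERR_FHEXPIRED."""


class Server:
    """Toy server: handles are strings derived from the server name (assumed)."""

    def __init__(self, name: str, files: dict[str, bytes]) -> None:
        self._handles: dict[str, bytes] = {}  # filehandle -> file contents
        self._paths: dict[str, str] = {}      # path -> filehandle
        for i, (path, data) in enumerate(files.items()):
            fh = f"{name}:{i}"
            self._handles[fh] = data
            self._paths[path] = fh

    def lookup(self, path: str) -> str:
        """Path lookup; raises KeyError if the name no longer exists."""
        return self._paths[path]

    def read(self, fh: str) -> bytes:
        if fh not in self._handles:
            raise FileHandleExpired(fh)
        return self._handles[fh]

    def unlink(self, path: str) -> None:
        """Remove the name only; data stays reachable through an open handle."""
        del self._paths[path]


def read_with_recovery(server: Server, path: str, fh: str) -> tuple[bytes, str]:
    """Client side: on expiry, repeat the path lookup and retry the read."""
    try:
        return server.read(fh), fh
    except FileHandleExpired:
        new_fh = server.lookup(path)  # extra round trip for every expired handle
        return server.read(new_fh), new_fh


files = {"/data/a": b"alpha", "/data/scratch": b"temp"}
old = Server("old", files)
fh_a, fh_scratch = old.lookup("/data/a"), old.lookup("/data/scratch")
old.unlink("/data/scratch")  # file is still open on the client, but has no name

new = Server("new", files)   # migrated copy; handles issued by "old" are unknown here
new.unlink("/data/scratch")

data, fh_a = read_with_recovery(new, "/data/a", fh_a)  # recovers via path lookup
try:
    read_with_recovery(new, "/data/scratch", fh_scratch)
    unlinked_recovered = True
except KeyError:
    unlinked_recovered = False  # no name left to look up, so recovery fails
```

In this sketch the named file is recovered at the cost of an additional lookup, while the unlinked file cannot be recovered at all, which corresponds to the two drawbacks noted above.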
None of these existing approaches provides a flexible and efficient way to seamlessly migrate a set of files from one server to another. What is needed in the art is a high-performance technique that preserves filehandles and facilitates efficient data migration.