Distributed file systems offer many compelling advantages in establishing high-performance computing environments. One example is the ability to easily expand, even at large scale. Another example is the ability to support multiple unique network protocols. For example, a cluster of nodes operating together to function as a distributed file system can support connections from clients using different network protocols. One storage client can access the distributed file system using the Network File System (“NFS”) protocol, a second using the Server Message Block (“SMB”) protocol, and a third using the Hadoop Distributed File System (“HDFS”) protocol. Not only can different clients access the distributed file system under different protocols, but multiple clients of a single protocol can also access the distributed file system.
With multiple clients connecting to a single distributed file system, objects, files, or directories that exist within the file system may be accessed by more than one client using more than one protocol. However, different protocols have different rules and different processes for how they interact with data residing in the file system. For example, clients using the SMB or NFS protocols can open a file, modify the contents of the file, and save the changes to the file. While some data protection processes (e.g., copy-on-write snapshots or other backup processes) may be in place in the file system to preserve the original file for clients to access, these protocols themselves allow data to be overwritten.
This is in contrast to protocols like HDFS, which are append-only for new writes. For example, after an HDFS client reads a file, if it wishes to make changes to the file and save the changes, the changes are stored separately as new data block(s) and appended into the file system. Clients can then access the newly modified version through some combination of the original and appended data, and can still access the original version of the data through the original data blocks, which the protocol does not write over. Another protocol, the object-based OpenStack Swift, provides for explicit versioning of objects, such that a change to any object creates a new object as the next version of the original object, while the original version is still retained by the file system.
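The three write behaviors described above (overwrite, append-only, and explicit versioning) can be contrasted with a minimal in-memory sketch. The class and method names below are hypothetical and chosen only for illustration; they are not the NFS, SMB, HDFS, or OpenStack Swift client APIs, and the sketch ignores networking, locking, and on-disk layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OverwriteFile:
    """Toy model of SMB/NFS-style writes: a save replaces the contents."""
    data: bytes = b""

    def write(self, new_data: bytes) -> None:
        # At the protocol level the previous contents are simply replaced.
        self.data = new_data

    def read(self) -> bytes:
        return self.data


@dataclass
class AppendOnlyFile:
    """Toy model of HDFS-style writes: existing blocks are never written over."""
    blocks: List[bytes] = field(default_factory=list)

    def append(self, block: bytes) -> None:
        # Changes are stored as a new block after the existing ones.
        self.blocks.append(block)

    def read(self, num_blocks: Optional[int] = None) -> bytes:
        # Reading only the first num_blocks blocks recovers an earlier state.
        selected = self.blocks if num_blocks is None else self.blocks[:num_blocks]
        return b"".join(selected)


@dataclass
class VersionedObject:
    """Toy model of explicit versioning: each change creates a new version."""
    versions: List[bytes] = field(default_factory=list)

    def put(self, data: bytes) -> int:
        # The prior version is retained; the new one becomes the latest.
        self.versions.append(data)
        return len(self.versions)  # 1-based version number

    def get(self, version: Optional[int] = None) -> bytes:
        if version is None:
            return self.versions[-1]
        return self.versions[version - 1]


if __name__ == "__main__":
    overwritten = OverwriteFile()
    overwritten.write(b"original")
    overwritten.write(b"modified")
    print(overwritten.read())

    appended = AppendOnlyFile()
    appended.append(b"original")
    appended.append(b" plus changes")
    print(appended.read(), appended.read(num_blocks=1))

    versioned = VersionedObject()
    versioned.put(b"original")
    versioned.put(b"modified")
    print(versioned.get(), versioned.get(version=1))
```

In this sketch, only the overwrite model loses the original contents after a change; the append-only and versioned models both keep the original data reachable, which is the distinction a multi-protocol file system would need to preserve.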
Therefore, there exists a need to honor the semantics of the underlying protocols in a multi-protocol environment, such that a plurality of network protocols can all work together in one name space, while retaining as much of the discriminative information associated with individual protocols as possible, making the client experience transparent to other protocols that are accessing the same data.