A distributed file system (DFS) is a type of file system in which file system resources (i.e., data and metadata) are stored on one or more networked storage devices. A DFS allows these networked storage devices, which collectively represent a distributed storage layer, to be simultaneously accessed by multiple client nodes in a manner that is similar, or identical, to local storage devices. In this way, the file system resources can be transparently shared across the client nodes. Examples of existing DFSs include GFS, HDFS, Ceph, and the like.
Generally speaking, existing DFSs can be classified as either symmetric or asymmetric. In a symmetric DFS, all data and metadata are managed by the same file system service(s) (running on either the client nodes or storage server nodes). Stated another way, all data (e.g., I/O) and metadata (e.g., namespace-related) requests are handled using a single code path that makes use of the same set of compute resources.
In an asymmetric DFS, data and metadata are managed by separate file system services. For example, there may be one or more dedicated metadata managers that are specifically configured to maintain the structural elements of the file system, and all metadata requests are routed to these dedicated metadata managers. Data requests are handled via a different and separate code path (which may run on a separate machine, or on the same machine as the metadata manager(s) but with its own distinct set of compute resources).
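The difference between the two designs can be illustrated with a minimal Python sketch. It is not taken from any particular DFS; the names `Request`, `Service`, `SymmetricRouter`, and `AsymmetricRouter` are hypothetical and exist only to show how requests might be dispatched in each case.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class RequestKind(Enum):
    DATA = auto()      # e.g., reading or writing file contents
    METADATA = auto()  # e.g., namespace operations such as create or rename


@dataclass
class Request:
    kind: RequestKind
    op: str
    path: str


@dataclass
class Service:
    """Hypothetical stand-in for a file system service with its own compute resources."""
    name: str
    handled: list = field(default_factory=list)

    def handle(self, request: Request) -> str:
        self.handled.append(request)
        return f"{self.name} handled {request.op} {request.path}"


class SymmetricRouter:
    """Symmetric design: every request goes through the same service."""

    def __init__(self, service: Service):
        self.service = service

    def route(self, request: Request) -> str:
        return self.service.handle(request)


class AsymmetricRouter:
    """Asymmetric design: metadata requests go to a dedicated metadata manager,
    data requests go through a separate data service."""

    def __init__(self, data_service: Service, metadata_manager: Service):
        self.data_service = data_service
        self.metadata_manager = metadata_manager

    def route(self, request: Request) -> str:
        if request.kind is RequestKind.METADATA:
            return self.metadata_manager.handle(request)
        return self.data_service.handle(request)


if __name__ == "__main__":
    router = AsymmetricRouter(Service("data-service"), Service("metadata-manager"))
    print(router.route(Request(RequestKind.DATA, "read", "/logs/a.txt")))
    print(router.route(Request(RequestKind.METADATA, "create", "/logs/b.txt")))
```

In this sketch, the two `Service` objects held by `AsymmetricRouter` could equally represent processes on separate machines or processes on one machine with distinct compute resources; the routing decision is the same in either case.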
One advantage of the asymmetric approach is that the data services and the metadata services can be scaled independently of one another. This is useful because, in some cases, the volume of data requests generated by storage clients in a DFS deployment may be significantly greater or less than the volume of metadata requests. For instance, consider a scenario where storage clients perform a large number of reads from existing files, but do not need to create or modify files often. In this scenario, with an asymmetric DFS, the data services can be scaled up to accommodate the heavy load of data read requests while the metadata services are left unchanged. With a symmetric DFS, the compute resources allocated to the combined data/metadata services would need to be scaled in tandem even though the metadata management load is relatively light, resulting in less flexibility and potentially inefficient use of system resources.
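The read-heavy scenario can be made concrete with a rough sizing sketch in Python. The model is deliberately simplistic and is an assumption made for illustration: each service instance is taken to handle a fixed number of requests per second, and all load and capacity figures are invented rather than measured. The function names are hypothetical.

```python
import math


def instances_needed(requests_per_sec: float, capacity_per_instance: float) -> int:
    """Number of instances required for a load, with a minimum of one instance."""
    return max(1, math.ceil(requests_per_sec / capacity_per_instance))


def asymmetric_plan(data_rps, meta_rps, data_capacity, meta_capacity):
    """Data and metadata services are sized separately."""
    return {
        "data": instances_needed(data_rps, data_capacity),
        "metadata": instances_needed(meta_rps, meta_capacity),
    }


def symmetric_plan(data_rps, meta_rps, combined_capacity):
    """Every instance runs the combined data/metadata service,
    so the whole pool is sized for the total load."""
    return {"combined": instances_needed(data_rps + meta_rps, combined_capacity)}


if __name__ == "__main__":
    # Illustrative figures only: many reads, very few namespace changes.
    print(asymmetric_plan(data_rps=50_000, meta_rps=200,
                          data_capacity=5_000, meta_capacity=1_000))
    # {'data': 10, 'metadata': 1}
    print(symmetric_plan(data_rps=50_000, meta_rps=200, combined_capacity=5_000))
    # {'combined': 11}
```

Under these assumed figures, the asymmetric deployment grows only its data tier, while the symmetric deployment replicates the metadata-handling logic and its resources on every one of its instances even though the metadata load would fit on a single instance.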
However, even with the asymmetric DFS design, there are use cases where the scalability and efficiency of file system services are not ideal. For example, there may be situations where different types of file system metadata are created or accessed at different rates, and/or where storage clients migrate between different physical machines. For these and other similar situations, a more flexible approach for handling file system metadata is desirable.