In a “load balancing” cluster file system, different nodes in the cluster access the same portion or the entirety of the shared file system. Clients of the file system are either randomly connected to a node, or a group of clients is designated to connect to a specific node. Each node may receive a different load of client requests for file services. If a node is receiving more requests than other nodes, it may forward a request to a node with a lower load. Ideally, each node should receive a similar number of file requests from clients.
Because any node participating in the cluster can contain the authoritative state on any given file system object, every node can be a synchronization point for a file. Since two or more nodes may access the same file at the same time, complex distributed concurrency algorithms are needed to resolve any access conflict. These algorithms are hard to write and can take years to become reliable enough to function properly in a production environment.
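The forwarding behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the `Node` class, the `dispatch` function, and the `threshold` parameter are hypothetical names, and counting pending requests is just one possible measure of load.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    pending: int = 0  # requests currently queued on this node (assumed load metric)


def dispatch(receiver: Node, cluster: list[Node], threshold: int = 2) -> Node:
    """Pick the node that serves a request that arrived at `receiver`.

    The request is forwarded only when the receiver is at least `threshold`
    requests busier than the least-loaded node; otherwise it is served locally.
    """
    least_loaded = min(cluster, key=lambda n: n.pending)
    if receiver.pending - least_loaded.pending >= threshold:
        target = least_loaded
    else:
        target = receiver
    target.pending += 1  # account for the newly accepted request
    return target
```

The threshold avoids forwarding for trivial imbalances, since every forward costs an extra network hop.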
The GPFS file system developed by IBM is an example of a Load Balancing Cluster File System.
In a “load sharing” cluster file system, each cluster node is responsible for serving one or more non-overlapping portions of the cluster file system namespace. If a node receives client requests for data outside the scope of the namespace it is serving, it may forward the request to the node that does service the requested region of the namespace.
Since the server nodes do not serve overlapping regions of the file system, only a single server contains the authoritative state of the portion of the file system it serves, so a single synchronization point exists for each portion. This removes the need to implement complex distributed concurrency algorithms.
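As a minimal sketch of how a node might decide whether to serve or forward a request, the following maps each request path to the node owning the longest matching namespace prefix. The `PartitionMap` class, the `route` function, and the node names are hypothetical and are not taken from any particular product.

```python
import posixpath


class PartitionMap:
    """Maps non-overlapping namespace prefixes to the node that owns them."""

    def __init__(self, owners: dict[str, str]) -> None:
        self.owners = {posixpath.normpath(p): n for p, n in owners.items()}

    def owner_of(self, path: str) -> str:
        # Walk up the directory tree until a registered prefix is found.
        candidate = posixpath.normpath(path)
        while True:
            if candidate in self.owners:
                return self.owners[candidate]
            if candidate == "/":
                raise KeyError(f"no node owns {path!r}")
            candidate = posixpath.dirname(candidate)


def route(receiving_node: str, path: str, pmap: PartitionMap) -> tuple[str, str]:
    """Return ("serve", node) if the receiver owns the path, else ("forward", owner)."""
    owner = pmap.owner_of(path)
    action = "serve" if owner == receiving_node else "forward"
    return action, owner
```

Because each prefix has exactly one owner, the lookup never has to arbitrate between nodes, which is the simplification the paragraph above describes.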
Load sharing cluster file systems generally provide such things as:
1) High Availability and Redundancy: Because the file system is configured within a cluster, cluster protection and availability are extended to the file system.
2) Reduced complexity: Since each node has exclusive ownership of the portion of the filesystem it serves, implementing a load sharing cluster filesystem is much simpler than implementing a load balancing cluster file system, where complex concurrency algorithms are needed to arbitrate shared access by each node to the complete file system.
3) Increased Performance and Capacity: With the partitioning of the filesystem, additional nodes can contribute to the serving of the filesystem, allowing higher total cluster capacity to serve the clients as well as improved performance under high load. The ability to partition the namespace based on need also allows end users to hierarchically structure data to match their business needs.
4) Pay as you go horizontal scaling: Namespace partitioning allows capacity and performance to be expanded as the need arises and in the area where the need is greatest, rather than globally in the cluster.
5) Enable real-time reconfiguration of the namespace: Unlike technologies like DFS, where there is no ability to transparently reconfigure the namespace or contact all clients using the namespace, Load Sharing Cluster File Systems maintain stateful information about all connections and are able to provide seamless namespace reconfiguration via server-side File Virtualization technology, solving one of the biggest deployment hurdles for technologies like DFS.
Since clients do not know how the cluster namespace is partitioned among the nodes of a cluster file system, the node that exports the entire namespace of the cluster file system bears the full burden and receives all of the request traffic for the cluster file system. That node must then direct each request to the node that is responsible for the relevant partition of the namespace. This extra hop adds latency and introduces a scalability problem. Furthermore, the workload of a Load Sharing Cluster is distributed among the nodes based on how the cluster namespace is partitioned. Certain namespaces may experience more workload than others, creating hotspots in the cluster file system. However, since only the node that owns a partitioned namespace is allowed to service requests for that namespace, other nodes with low workload cannot help nodes that are busy. Finally, reconfiguring the partitioned namespaces among the nodes usually involves moving data or metadata from one node to another, and this data movement is very disruptive. Thus, while it is desirable to provide a load sharing cluster file system, these problems must be resolved before a Load Sharing Cluster File System becomes practical.
Microsoft DFS allows administrators to create a virtual folder consisting of a group of shared folders located on different servers by transparently connecting them to one or more DFS namespaces. A DFS namespace is a virtual view of shared folders in an organization. Each virtual folder in a DFS namespace may be a DFS link that specifies a file server that is responsible for the namespace identified by the virtual folder, or it may be another virtual folder containing other DFS links and virtual folders. Under DFS, a file server that exports shared folders may be a member of many DFS namespaces. A file server in a DFS namespace is not aware that it is a member of that namespace. In essence, DFS creates a loosely coupled distributed file system consisting of one or more file servers that operate independently of each other in the namespace.
DFS uses a client-side name resolution scheme to locate the file server that is to process file requests for a virtual folder in a DFS namespace. The server that exports the DFS namespace in a root virtual folder, the DFS root server, receives all the name resolution request traffic destined for the virtual folder.
The clients of a DFS namespace ask the DFS root server for the target file server, and the shared folder on that file server, that correspond to a DFS virtual folder. Upon receiving this information, the DFS client is responsible for redirecting file requests to the target file server, using a new path name constructed from the information obtained from the DFS root server. To reduce its load, the DFS root server does not keep track of the clients of the exported DFS namespace. To reduce the load further, clients keep a cache that associates each virtual folder with its target server and the actual pathname on the target server. Once the client completes the client-side resolution and connects to the target file server, the DFS root server no longer participates in the network IO. Furthermore, if a name is in the client-side DFS cache, the client will not contact the DFS root server again for the same name until the cache entry becomes stale, usually after about 15 minutes. This approach makes DFS efficient, since the client is rapidly connected directly to the target file server.
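The client-side caching described above can be illustrated with a small time-limited cache. This is a sketch, not the DFS protocol itself: `ReferralCache`, `ask_root`, and `root_lookups` are hypothetical names, and the 15-minute default simply mirrors the figure quoted above.

```python
import time
from typing import Callable

Target = tuple[str, str]  # (target file server, path on that server)


class ReferralCache:
    """Caches virtual-folder lookups so the root server is asked only when needed."""

    def __init__(
        self,
        ask_root: Callable[[str], Target],  # hypothetical call to the root server
        ttl_seconds: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ask_root = ask_root
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Target]] = {}
        self.root_lookups = 0  # how many times the root server was contacted

    def resolve(self, virtual_folder: str) -> Target:
        now = self._clock()
        entry = self._entries.get(virtual_folder)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no traffic to the root server
        target = self._ask_root(virtual_folder)
        self.root_lookups += 1
        self._entries[virtual_folder] = (now + self._ttl, target)
        return target
```

Note that a cached entry keeps being returned until it expires even if the root server's mapping has changed in the meantime, because nothing in this scheme lets the server notify the client.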
Due to the client-side nature of the protocol and the fact that a connection is not maintained from the client to the DFS server, configuration changes in the DFS namespace cannot be propagated to the clients, especially since DFS does not maintain a client list. Further complicating the problem, each client also uses a client-side DFS name cache and will not obtain up-to-date file location information from the server until the cache is stale. Therefore, maintaining configuration consistency is a big challenge. A failure to maintain configuration consistency, even for a short time, may lead to data corruption. Thus, for DFS to become a viable cluster solution, configuration consistency must be maintained at all times.
Generally speaking, “file virtualization” is a method for a computer node to proxy client filesystem requests to a secondary storage server that has been virtually represented in the local portion of the file system namespace as a mounted folder.
A traditional file system manages the storage space by providing a hierarchical namespace. The hierarchical namespace starts from the root directory, which contains files and subdirectories. Each directory may also contain files and subdirectories identifying other files or subdirectories. Data is stored in files. Every file and directory is identified by a name. The full name of a file or directory is constructed by concatenating the name of the root directory and the names of each subdirectory that finally leads to the subdirectory containing the identified file or directory, together with the name of the file or the directory.
The full name of a file thus carries with it two pieces of information: (1) the identification of the file and (2) the physical storage location where the file is stored. If the physical storage location of a file is changed (for example, moved from one partition mounted on a system to another), the identification of the file changes as well.
For ease of management, as well as for a variety of other reasons, the administrator would like to control the physical storage location of a file. For example, important files might be stored on expensive, high-performance file servers, while less important files could be stored on less expensive and less capable file servers.
Unfortunately, moving files from one server to another usually changes the full name of the files and thus, their identification, as well. This is usually a very disruptive process, since after the move users may not be able to remember the new location of their files. Thus, it is desirable to separate the physical storage location of a file from its identification. With this separation, IT and system administrators will be able to control the physical storage location of a file while preserving what the user perceives as the location of the file (and thus its identity).
File virtualization is a technology that separates the full name of a file from its physical storage location. File virtualization is usually implemented as a hardware appliance that is located in the data path between users and the file servers. For users, a file virtualization appliance appears as a file server that exports the namespace of a file system. From the file servers' perspective, the file virtualization appliance appears as just a normal user. Attune System's Maestro File Manager (MFM) is an example of a file virtualization appliance. FIG. 1 is a schematic diagram showing an exemplary switched file system including a file switch (MFM).
As a result of separating the full name of a file from the file's physical storage location, file virtualization provides the following capabilities:
1) Creation of a synthetic namespace: Once a file is virtualized, the full filename does not provide any information about where the file is actually stored. This leads to the creation of synthetic directories, where the files in a single synthetic directory may be stored on different file servers. A synthetic namespace can also be created in which the directories may contain files or directories from a number of different file servers. Thus, file virtualization allows the creation of a single global namespace from a number of cooperating file servers. The synthetic namespace is not restricted to one file server or one file system.
2) Allows having many full filenames to refer to a single file: As a consequence of separating a file's name from the file's storage location, file virtualization also allows multiple full filenames to refer to a single file. This is important because it allows existing users to use the old filename while allowing new users to use a new name to access the same file.
3) Allows having one full name to refer to many files: Another consequence of separating a file's name from the file's storage location is that one filename may refer to many files. Files that are identified by a single filename need not contain identical contents. If the files do contain identical contents, then one file is usually designated as the authoritative copy, while the other copies are called the mirror copies. Mirror copies increase the availability of the authoritative copy, since even if the file server containing the authoritative copy of a file is down, one of the mirror copies may be designated as the new authoritative copy and normal file access can then be resumed. On the other hand, the contents of a file identified by a single name may change according to the identity of the user who wants to access the file.
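The three capabilities above can be illustrated with a small in-memory mapping that keeps full filenames separate from storage locations. This is a sketch under assumptions: `VirtualNamespace` and its method names are hypothetical, the server names and paths are made up, and it does not describe how any particular file virtualization appliance is implemented.

```python
Location = tuple[str, str]  # (file server, physical path on that server)


class VirtualNamespace:
    """Separates full filenames from the places where file data is stored."""

    def __init__(self) -> None:
        self._names: dict[str, str] = {}              # full filename -> file id
        self._copies: dict[str, list[Location]] = {}  # file id -> copies, authoritative first

    def add_file(self, file_id: str, server: str, physical_path: str) -> None:
        self._copies[file_id] = [(server, physical_path)]

    def add_mirror(self, file_id: str, server: str, physical_path: str) -> None:
        self._copies[file_id].append((server, physical_path))

    def link(self, full_name: str, file_id: str) -> None:
        """Give a file a (possibly additional) full filename."""
        if file_id not in self._copies:
            raise KeyError(file_id)
        self._names[full_name] = file_id

    def locate(self, full_name: str) -> Location:
        """Return the authoritative copy behind a full filename."""
        return self._copies[self._names[full_name]][0]

    def migrate(self, file_id: str, server: str, physical_path: str) -> None:
        """Move the authoritative copy; no full filename changes."""
        self._copies[file_id][0] = (server, physical_path)

    def promote_mirror(self, file_id: str) -> Location:
        """Drop a failed authoritative copy and promote the next mirror."""
        copies = self._copies[file_id]
        if len(copies) < 2:
            raise LookupError("no mirror copy available")
        copies.pop(0)
        return copies[0]
```

In this sketch a synthetic directory is simply a set of full filenames sharing a prefix whose files live on different servers, and migrating a file updates one table entry while every name that refers to it stays valid.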
Cluster file systems may be used to meet the strong growth in end users' unstructured data needs. A load sharing cluster file system is generally simpler to implement than a load balancing cluster file system. Furthermore, a cluster file system that uses a partitioned namespace to divide workload among the nodes in a cluster is a better match for the business environment. This is because each organization in a business environment usually has its own designated namespace. For example, the engineering department may own the namespace /global/engineering, while the marketing department owns the /global/marketing namespace. If engineering needs more resources, the engineering namespace may be further partitioned and more nodes added for engineering, without affecting the marketing department.
DFS is a good match for a load sharing namespace. Unfortunately, it is hard to maintain configuration consistency among all clients. DFS is also not a true cluster and does not provide protection from failure.