The quest to make Network Attached Storage (NAS) scalable has led to architectures that depart from the traditional direct-attached storage (DAS) model. The DAS architecture comprises several storage devices attached to a single computer. In emerging NAS architectures (referred to herein as a NAS clustered architecture), a cluster of computers comprises a NAS gateway. The NAS gateway shares the work of a traditional single-node NAS server. Storage devices are shared among the members of the cluster via a Storage Area Network (SAN).
The NAS clustered architecture is preferred to the traditional single-server architecture for various reasons. The NAS clustered architecture is highly scalable in two dimensions: the quantity of storage devices that can be used and the number of computing servers performing file system services. Further, the NAS clustered architecture exhibits enhanced fault tolerance that makes it the preferred architecture of future NAS devices.
Although this technology has proven to be useful, it would be desirable to present additional improvements. Network-file access protocols such as, for example, the network file system (NFS) protocols that were traditionally embedded in NAS devices were not designed with such clustered architectures in mind. Consequently, the fault-tolerant file and record locking features supported by those protocols do not work well in the NAS clustered architecture.
One conventional approach to providing fault-tolerant file and record locking features to the NAS clustered architecture assigns ownership of all file and record locks to individual servers in the NAS gateway cluster. When a server in the NAS gateway receives a lock request, the server determines whether another server owns the lock. If another server owns the requested lock, the server receiving the lock request issues a demand-lock request via an inter-cluster message to the server owning the lock to initiate transfer of ownership of the lock to the server that received the current lock request.
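By way of illustration only, the ownership-transfer flow described above may be sketched as follows. This is a minimal, in-process sketch under stated assumptions, not the implementation of any particular system: the names `Cluster`, `Server`, `handle_lock_request`, and `on_demand_lock` are hypothetical, the shared ownership directory is assumed because the description does not state how a server learns which server owns a lock, and the count of two messages per transfer (demand plus reply) is likewise an assumption.

```python
class Cluster:
    """Hypothetical shared view of the cluster (an assumption of this sketch)."""

    def __init__(self):
        self.servers = {}   # server name -> Server
        self.owner_of = {}  # lock id -> name of the owning server


class Server:
    """One member of the NAS gateway cluster."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.owned = set()
        cluster.servers[name] = self

    def handle_lock_request(self, lock_id):
        """Take ownership of lock_id; return the number of messages exchanged."""
        messages = 0
        owner = self.cluster.owner_of.get(lock_id)
        if owner is not None and owner != self.name:
            # Another server owns the lock: issue a demand-lock request to it.
            self.cluster.servers[owner].on_demand_lock(lock_id)
            messages += 2  # assumed cost: one demand message plus one reply
        self.owned.add(lock_id)
        self.cluster.owner_of[lock_id] = self.name
        return messages

    def on_demand_lock(self, lock_id):
        """Relinquish ownership in response to a demand-lock request."""
        self.owned.discard(lock_id)
```

In this sketch, a request for a lock that no other server owns costs no messages, whereas a request arriving at a server other than the current owner costs one message exchange.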
The protocol for this approach requires ownership of locks to be transferred via an inter-cluster protocol requiring a set of messages; consequently, this approach entails some network overhead. This approach fails to address issues that appear when the cluster is used as a multi-protocol NAS server platform. Further, this approach does not address lock contention among the various network file system protocols nor does it address server failures and server failure recovery.
Another conventional approach forwards lock requests on a given file system to a single server, thus avoiding the need for inter-cluster coordination while serving the request. A request received through a server that is not assigned to handle the lock requests for the underlying file system must be forwarded to the proper server, resulting in significant overhead. This approach does not support load balancing. Further, no effort is made by this approach to address multi-protocol support for locking at the cluster servers.
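The forwarding approach described above may be sketched, for illustration only, as follows. The class `ForwardingCluster`, its `assignment` table, and the `forwarded` counter are hypothetical names introduced for this sketch. The sketch assumes a static mapping from each file system to its single lock server and models a forward as a counter increment rather than a real network message.

```python
class ForwardingCluster:
    """Routes every lock request for a file system to its one assigned server."""

    def __init__(self, assignment):
        self.assignment = dict(assignment)  # file system -> assigned lock server
        self.locks = {}                     # (file system, path) -> holding client
        self.forwarded = 0                  # requests that needed an extra hop

    def request_lock(self, receiving_server, fs, path, client):
        """Return (server that handled the request, whether the lock was granted)."""
        lock_server = self.assignment[fs]
        if receiving_server != lock_server:
            # The request arrived at the wrong server and must be forwarded.
            self.forwarded += 1
        # The first client to ask holds the lock; later clients are refused.
        holder = self.locks.setdefault((fs, path), client)
        return lock_server, holder == client
```

Because every request for a given file system is served by the same server, the lock table needs no coordination among servers, but the assigned server carries the entire lock load for that file system.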
Yet another conventional approach utilizes state information managed by a file server; the state information is maintained among the clients of the distributed system. When a server fails in this approach, the state maintained by the clients is transferred to the backup server. This approach requires that clients maintain knowledge of the identity of a backup server. Clients are required to keep the server state and rebuild that server state on a new server in the case of a server failure. Further, this approach provides no means to fail back the clients to the original server after recovery from failure.
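The client-maintained state approach described above may be sketched, for illustration only, as follows. The names `FileServer`, `Client`, `grant`, and `fail_over` are hypothetical, and the sketch assumes that the state in question consists solely of held locks. Consistent with the description, the sketch includes no path for returning clients to the original server.

```python
class FileServer:
    """Holds lock state that is lost if the server fails."""

    def __init__(self, name):
        self.name = name
        self.locks = {}  # path -> id of the client holding the lock

    def grant(self, path, client_id):
        """Grant the lock unless another client already holds it."""
        holder = self.locks.setdefault(path, client_id)
        return holder == client_id


class Client:
    """Keeps its own copy of the server state so the state can be rebuilt."""

    def __init__(self, client_id, primary, backup):
        self.client_id = client_id
        self.server = primary
        self.backup = backup  # the client must know the identity of the backup
        self.held = []        # client-side record of locks it holds

    def lock(self, path):
        if self.server.grant(path, self.client_id):
            if path not in self.held:
                self.held.append(path)
            return True
        return False

    def fail_over(self):
        """Switch to the backup server and replay the held locks onto it."""
        self.server = self.backup
        return [p for p in self.held if self.server.grant(p, self.client_id)]
```

In this sketch the burden of recovery rests entirely on the client, which must both record its locks and know which server to replay them to.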
Presently, there exists no known method for providing a distributed locking solution that works properly for various network file access protocols in the framework of a clustered NAS running on top of cluster file systems. What is therefore needed is a system, a computer program product, and an associated method for preserving state for a cluster of file servers in a cluster file system, in the presence of load-balancing, failover, and fail-back events. The need for such a file and record locking solution for a clustered NAS running on top of a cluster file system has heretofore remained unsatisfied.