FIG. 1 is a block diagram illustrating a non-limiting exemplary architecture of a distributed file system 100 implementing Network Attached Storage (NAS) in accordance with the prior art. Distributed file server 120 may include a plurality of nodes (also known as controllers) 130-1 to 130-x connected to a bus 180 operating over Internet Small Computer Systems Interface (iSCSI), Fibre Channel (FC), or the like.
Bus 180 connects distributed file server 120 to a plurality of block storage devices 190, possibly configured as part of a Storage Area Network (SAN) device and arranged, for example, in a Redundant Array of Independent Disks (RAID) configuration.
Each of nodes 130-1 to 130-x may include a central processing unit (CPU) 160-1 to 160-x respectively, and memory units 150-1 to 150-x respectively, on which several processes are being executed. Nodes 130-1 to 130-x may communicate with a plurality of clients over network protocols such as Network File System (NFS).
Some of the processes running over nodes 130-1 to 130-x may include file system daemons (FSDs) 170-1 to 170-x. Each of nodes 130-1 to 130-x may include one or more FSDs which serve as containers for services and effectively control files in distributed file server 120.
Files in distributed file server 120 are distributed across FSDs 170-1 to 170-x and across nodes 130-1 to 130-x. Distributed file server 120 may also include Network File System (NFS) servers 140-1 to 140-x in at least one of nodes 130-1 to 130-x, wherein each of NFS servers 140-1 to 140-x may receive a request 112 from clients such as client 110.
Generally, the NFS protocol, such as that running on distributed file server 120, strives to provide the same POSIX file-access semantics as locally mounted POSIX file systems do. One difference is the handling of temporary failures. For example, client 110 may remove a file, and the reply for the successful remove may fail to arrive at the client due to network issues or a failure of distributed file server 120 just before the reply was sent. In such a case, client 110 may retry the operation, but the second attempt may report a file-not-found error, which might trigger a client application failure. Similar issues can occur with other stateful operations (e.g., OPEN, LOCK).
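By way of a non-limiting illustration, the retry problem described above may be sketched as follows. The class `NaiveFileServer`, its `remove` method, the file name, and the reply strings are hypothetical names introduced only for this sketch; they model a server that keeps no record of previously answered requests and are not taken from any NFS implementation.

```python
class NaiveFileServer:
    """Hypothetical server that executes every request it receives,
    with no memory of replies it has already produced."""

    def __init__(self, files):
        self.files = set(files)

    def remove(self, name):
        # A repeated remove is indistinguishable from a remove of a
        # file that never existed.
        if name not in self.files:
            return "FILE_NOT_FOUND"
        self.files.remove(name)
        return "OK"


server = NaiveFileServer({"report.txt"})

# The first attempt succeeds on the server, but the reply is assumed
# to be lost before it reaches the client.
first_reply = server.remove("report.txt")

# The client retries the same operation and receives an error, even
# though its original request was carried out.
retry_reply = server.remove("report.txt")
```

In this sketch the retry returns a file-not-found result although the original remove succeeded, which is the condition that may cause the client application to fail.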
Network File System (NFS) Version 4 Minor Version 1 Protocol (NFSv4.1), Request for Comments (RFC) 5661, tries to remedy the duplicate-request issue of stateful requests by introducing the slot table mechanism, such as slot tables 175-1 to 175-x, each of which is a duplicate-reply-cache array with a pre-negotiated size. Each client is associated with a server-side slot-table object stored on the respective slot tables 175-1 to 175-x, and every client request, such as request 112, is associated with a slot in the slot table. Once processing of request 112 is complete, the reply is cached in its associated slot of slot tables 175-1 to 175-x. When an NFS server, such as NFS server 140-1, receives a request 112, it first checks for a match in slot table 175-1. In a case that NFS server 140-1 finds a ready reply, it may immediately send it back to client 110. RFC 5661 specifies that, if the server cannot find a client-related slot entry, the operation should fail.
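The slot table behavior described above may be sketched, in a simplified and non-limiting manner, as follows. The names `Slot`, `NfsServer`, `create_session`, and `remove`, the reply strings, and the client identifiers are assumptions made for illustration. The sketch keys each slot table by a client identifier, following the description above, and uses a per-slot sequence number to recognize a retransmission; it is not a complete or conforming NFSv4.1 implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    seq_id: int = 0                      # sequence id of the last completed request
    cached_reply: Optional[str] = None   # reply cached for that request


class NfsServer:
    """Hypothetical server holding one slot table per client."""

    def __init__(self, files):
        self.files = set(files)
        self.slot_tables: Dict[str, List[Slot]] = {}

    def create_session(self, client_id: str, table_size: int) -> None:
        # The table size stands in for the pre-negotiated size.
        self.slot_tables[client_id] = [Slot() for _ in range(table_size)]

    def remove(self, client_id: str, slot_id: int, seq_id: int, name: str) -> str:
        table = self.slot_tables.get(client_id)
        if table is None or not 0 <= slot_id < len(table):
            # No client-related slot entry: the operation fails.
            return "FAILED: no slot entry for client"
        slot = table[slot_id]
        if seq_id == slot.seq_id and slot.cached_reply is not None:
            # Retransmission of a completed request: replay the cached reply.
            return slot.cached_reply
        if seq_id != slot.seq_id + 1:
            return "FAILED: sequence misordered"
        # New request: execute it, then cache the reply in its slot.
        if name in self.files:
            self.files.remove(name)
            reply = "OK"
        else:
            reply = "FILE_NOT_FOUND"
        slot.seq_id, slot.cached_reply = seq_id, reply
        return reply


server = NfsServer({"report.txt"})
server.create_session("client-110", table_size=4)

first = server.remove("client-110", 0, 1, "report.txt")    # reply assumed lost
retry = server.remove("client-110", 0, 1, "report.txt")    # same slot and sequence id
unknown = server.remove("client-999", 0, 1, "report.txt")  # client with no slot table
```

Here the retried request is answered from the cache with the original successful reply rather than being executed a second time, while a request from a client that has no slot table fails.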
However, the RFC 5661 slot table does not address failover scenarios in distributed multi-node clusters. An exemplary scenario in distributed file server 120 is one in which client 110 is connected to controller 130-1 and is streaming a video file. If controller 130-1 fails for some reason, the client may be redirected to another controller, such as controller 130-2. Client 110 may try to resume the flow from the point at which it was interrupted by the failure of controller 130-1. However, controller 130-2 does not have a slot table entry for client 110. The operation could therefore fail according to RFC 5661, and client 110 may need to close all open files and restart the session.
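The failover scenario above may likewise be sketched under simplifying assumptions. The class `Node`, its methods, the operation strings, and the client identifier are hypothetical and serve only to show that slot tables held locally by one controller are not visible to another controller.

```python
class Node:
    """Hypothetical controller whose slot tables exist only in its own memory."""

    def __init__(self, name):
        self.name = name
        # client_id -> {slot_id: (seq_id, cached_reply)}
        self.slot_tables = {}

    def create_session(self, client_id):
        self.slot_tables[client_id] = {}

    def handle(self, client_id, slot_id, seq_id, op):
        table = self.slot_tables.get(client_id)
        if table is None:
            # This controller has never seen the client's slot table.
            return "FAILED: no slot table for client"
        cached = table.get(slot_id)
        if cached is not None and cached[0] == seq_id:
            return cached[1]             # replay the cached reply
        reply = f"{op} ok"
        table[slot_id] = (seq_id, reply)
        return reply


node_1 = Node("130-1")
node_2 = Node("130-2")

# The client establishes its session on controller 130-1 and streams data.
node_1.create_session("client-110")
before_failover = node_1.handle("client-110", 0, 1, "READ chunk 41")

# Controller 130-1 is assumed to fail; the client is redirected to 130-2
# and tries to continue from where it was interrupted.
after_failover = node_2.handle("client-110", 0, 2, "READ chunk 42")
```

In this sketch the request sent to the second controller fails because that controller holds no slot table for the client, mirroring the condition in which the client may need to close its open files and restart the session.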