The architecture of a distributed file system such as a Hadoop Distributed File System (HDFS) typically has a name node that hosts the file system index, and a cluster of data nodes, each of which hosts units of data called blocks. The name node is a single point of failure for the availability of an HDFS, because the system relies on the file system index hosted by the name node to access the data stored in the data nodes. To lessen the impact of an HDFS outage on internal and external users, and to directly serve user requests in real time, high availability (HA) can be added to the HDFS name node. An HA architecture allows the main name node to fail over to a backup name node.
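The dependency on the name node can be illustrated with a minimal sketch. This is a toy model, not HDFS code: the classes `NameNode` and `DataNode`, the `available` flag, and the `read_file` helper are all hypothetical names introduced for illustration. It assumes only what is described above, namely that block locations are resolved through the index on the name node before any data node is contacted.

```python
class DataNode:
    """Toy data node: hosts blocks keyed by block ID (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


class NameNode:
    """Toy name node: hosts the index mapping a path to (block ID, data node) pairs."""

    def __init__(self):
        self.index = {}
        self.available = True  # hypothetical flag used to simulate an outage

    def locate(self, path):
        if not self.available:
            raise ConnectionError("name node unavailable; index cannot be read")
        return self.index[path]


def read_file(name_node, data_nodes, path):
    """Resolve block locations via the name node, then fetch each block."""
    locations = name_node.locate(path)
    return b"".join(data_nodes[node].blocks[block_id] for block_id, node in locations)


dn1, dn2 = DataNode("dn1"), DataNode("dn2")
dn1.blocks["b1"] = b"hello "
dn2.blocks["b2"] = b"world"
data_nodes = {"dn1": dn1, "dn2": dn2}

nn = NameNode()
nn.index["/f"] = [("b1", "dn1"), ("b2", "dn2")]

content = read_file(nn, data_nodes, "/f")

# Simulate a name node outage: the blocks are still on the data nodes,
# but they can no longer be located.
nn.available = False
try:
    read_file(nn, data_nodes, "/f")
    outage_blocks_reads = False
except ConnectionError:
    outage_blocks_reads = True
```

In this model the data nodes still hold every block during the outage, yet the read fails because the index is unreachable, which is the sense in which the name node is a single point of failure.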
Even though only one name node in an HA architecture can be active and send commands to data nodes at any given time, a data node may, in certain scenarios, receive commands from a name node that is not currently active. This anomaly may arise under various circumstances. For example, if one of the network interfaces of the first name node fails, and a decision is made to change the active name node from the first name node to a second name node, the first name node may not be aware of the decision. In this case, the first name node may continue to send commands, and the second name node, now the active or master name node, may also send commands. In another example, the first name node sends a command to a data node, and soon afterward a failover occurs from the first name node to a second name node. If, because of a delay, the data node does not receive or process the command until after the failover, the data node may end up receiving commands from both name nodes.
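The delayed-command scenario can be sketched as a small timeline simulation. Everything here is an illustrative assumption: the time units, the failover time, the delays, and the command strings are made up, and the data node is modeled as accepting any command it receives without checking the sender.

```python
FAILOVER_TIME = 2  # hypothetical time at which nn2 replaces nn1 as the active name node


def active_name_node(t):
    """Return which name node is active at time t in this toy timeline."""
    return "nn1" if t < FAILOVER_TIME else "nn2"


# Each entry: (send_time, network_delay, sender, command). All values are made up.
sent = [
    (1, 5, "nn1", "delete block b7"),     # sent while nn1 is active, but delayed
    (3, 1, "nn2", "replicate block b9"),  # sent after the failover
]

# The data node processes commands in arrival order, not send order.
arrivals = sorted(
    (send_time + delay, sender, command) for send_time, delay, sender, command in sent
)

processed = []
for arrival_time, sender, command in arrivals:
    # A command is stale if its sender is no longer active when it arrives.
    stale = sender != active_name_node(arrival_time)
    processed.append((arrival_time, sender, command, stale))

for entry in processed:
    print(entry)
```

In this timeline the command from `nn1` was legitimate when it was sent but arrives at time 6, after the failover, so the data node ends up processing commands from both name nodes.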
Similarly, two data nodes may each receive a command to delete a replica of the same unit of data, leading to data loss and other issues. For example, a data block may have two replicas hosted on the first and the second data nodes, respectively, while the desired number of replicas is one. The first name node may be initially active and send a command to the first data node to delete its hosted replica. Immediately after sending this command, the first name node may crash and a failover may occur. The second name node may become active without knowing about the command issued by the first name node. It may then send a command to the second data node to delete its hosted replica, resulting in the deletion of both replicas.
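The double-deletion example can be reproduced with a minimal sketch. The class names, the `trim_excess` method, and the block ID `b1` are hypothetical. The sketch assumes that each name node acts on its own view of replica locations and that the second name node never learns of the first name node's command.

```python
TARGET_REPLICAS = 1  # desired number of replicas, as in the example above


class DataNode:
    """Toy data node that deletes a replica whenever it is told to."""

    def __init__(self, name, blocks):
        self.name = name
        self.blocks = set(blocks)

    def delete(self, block_id):
        self.blocks.discard(block_id)


class NameNode:
    """Toy name node holding its own, possibly outdated, view of replica locations."""

    def __init__(self, name, replica_map):
        self.name = name
        self.replica_map = {b: list(nodes) for b, nodes in replica_map.items()}

    def trim_excess(self, block_id, data_nodes, victim):
        """If this node's view shows too many replicas, delete the one on `victim`."""
        if len(self.replica_map[block_id]) > TARGET_REPLICAS:
            data_nodes[victim].delete(block_id)
            self.replica_map[block_id].remove(victim)


data_nodes = {"dn1": DataNode("dn1", ["b1"]), "dn2": DataNode("dn2", ["b1"])}
initial_view = {"b1": ["dn1", "dn2"]}

nn1 = NameNode("nn1", initial_view)
nn2 = NameNode("nn2", initial_view)  # same starting view, updated independently

# nn1 is active, trims the replica on dn1, then crashes before nn2 learns of it.
nn1.trim_excess("b1", data_nodes, victim="dn1")

# After the failover, nn2 still believes two replicas exist and trims dn2's copy.
nn2.trim_excess("b1", data_nodes, victim="dn2")

surviving = [name for name, dn in data_nodes.items() if "b1" in dn.blocks]
print(surviving)
```

Each name node acts correctly on its own view, yet no replica of `b1` survives, because the second name node's view was never updated with the first deletion.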