Data may be stored as unstructured data, for example, in files and directories in a file system. A distributed file system may store multiple copies (“replicas”) of a file and/or directory on more than one storage server machine to help ensure that, in case of a hardware failure and/or system failure, the data is still accessible. If a storage server machine experiences a failure, that storage server machine may be unavailable, but changes can still be made to the replicas on the available storage server machines. The replica on the storage server machine that is down may then be stale, i.e., may no longer hold a current version of the data. When the failed storage server machine is powered back up, the changes that were made to the other replicas should be propagated to the replica on the failed storage server machine. The replica on the failed storage server machine can be referred to as a target replica, and an up-to-date replica used for propagating changes can be referred to as a source replica. Because the target replica is out-of-date, it should not be used as a source replica to update any other replica.
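The relationship between source and target replicas can be illustrated with a minimal sketch. It assumes, purely for illustration, that each replica carries a change counter; the `Replica` class, the `version` field, the `write` and `classify` helpers, and the server names are hypothetical and are not part of any particular distributed file system.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    server: str             # hypothetical storage server name
    version: int = 0        # hypothetical counter of applied changes
    available: bool = True  # whether the hosting server is reachable


def write(replicas):
    """Apply one change to every replica whose server is reachable."""
    for r in replicas:
        if r.available:
            r.version += 1


def classify(replicas):
    """Split replicas into up-to-date sources and stale targets."""
    latest = max(r.version for r in replicas)
    sources = [r for r in replicas if r.available and r.version == latest]
    targets = [r for r in replicas if r.version < latest]
    return sources, targets


replicas = [Replica("server-a"), Replica("server-b")]
write(replicas)                # both replicas receive the first change
replicas[1].available = False  # server-b fails
write(replicas)                # only server-a receives the second change
replicas[1].available = True   # server-b is powered back up, still stale

sources, targets = classify(replicas)
```

In this sketch, the replica on `server-a` is the only valid source, and the replica on `server-b` is a target that should not be used to update any other replica until it has been repaired.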
The propagation of changes can take some time. For example, when the replica being repaired is a copy of a large virtual machine image file, the repair process may take several minutes. During the repair process, the storage server machine hosting the source replica can become unavailable (e.g., can go down or become disconnected from the network) before the update of the target replica is complete. As a result, the target replica can remain out-of-date, which can cause data loss and/or create problems with data consistency in the distributed file system (e.g., if the target replica is later used to repair another replica).
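The interrupted-repair scenario can be sketched as follows. The sketch assumes, for illustration only, that a repair copies a file chunk by chunk from the source replica to the target replica; the `repair` function and its `source_fails_after` parameter are hypothetical and simply simulate the source becoming unavailable partway through.

```python
def repair(source_chunks, target_chunks, source_fails_after=None):
    """Copy chunks from source to target in place.

    Returns True if every chunk was copied, or False if the simulated
    source failure interrupted the repair.
    """
    for i, chunk in enumerate(source_chunks):
        if source_fails_after is not None and i >= source_fails_after:
            return False  # source became unavailable mid-repair
        target_chunks[i] = chunk
    return True


source = ["a2", "b2", "c2", "d2"]  # current data on the source replica
target = ["a1", "b1", "c1", "d1"]  # stale data on the target replica

# The source is lost after two of the four chunks have been copied.
completed = repair(source, target, source_fails_after=2)
```

Under these assumptions, the repair reports that it did not complete, and the target holds a mix of new and old chunks. Treating such a target as a source for another replica would spread the inconsistent data.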