Data may be stored as unstructured data, for example, in files and directories in a file system. A distributed file system may store multiple copies of a file and/or directory on more than one storage server machine to help ensure that, in case of a hardware failure and/or system failure, the data remains accessible. If a storage server machine experiences a failure, that machine may be unavailable, but changes can still be made to the copies of the data on the available storage server machines. The data on the storage server machine that is down may then be stale, that is, no longer a current version of the data. When the failed storage server machine is powered back up, the changes that were made to the other copies of the data should be propagated to it. The process of updating the stale data on the storage server machine may be known as “self-healing.”
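As an illustration only, the following minimal sketch models how a copy may become stale while its storage server machine is down and how it may later be updated. The names `Replica`, `write`, and `self_heal`, and the use of a per-file version counter, are assumptions made for this example and do not describe any particular file system.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage server's copy of the data (hypothetical toy model)."""
    name: str
    online: bool = True
    # Maps a path to a (version, content) pair.
    files: dict = field(default_factory=dict)


def write(replicas, path, content):
    """Apply a write to every replica that is currently online."""
    current = max((r.files.get(path, (0, None))[0] for r in replicas), default=0)
    for replica in replicas:
        if replica.online:
            replica.files[path] = (current + 1, content)


def self_heal(stale, source):
    """Copy entries that are newer on `source` onto `stale`.

    Returns the sorted list of paths that were updated.
    """
    healed = []
    for path, (version, content) in source.files.items():
        if stale.files.get(path, (0, None))[0] < version:
            stale.files[path] = (version, content)
            healed.append(path)
    return sorted(healed)
```

In this sketch, a write made while one replica is offline leaves that replica with an older version, and a later call to `self_heal` brings it up to date from an available replica.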
In traditional self-healing solutions, a computing machine typically crawls the entire file system to determine which files and/or directories should be self-healed. The computing machine generally checks the change log for each of the files and/or directories to determine whether a particular file or directory has stale data and should be self-healed. Generally, the conventional approach of examining the change logs identifies which directories and/or files should be self-healed only at the level of the individual file or directory. Traditionally, change logs do not provide information at a higher level, such as the volume level, that would identify which files and/or directories in a given volume should be self-healed without having to crawl the entire volume and inspect the change log of each file and directory in it. Some volumes may have up to two billion files, so examining each change log can be a time-consuming and resource-intensive process.
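The cost of such a crawl can be illustrated with a minimal sketch. Here a volume is modelled, purely as an assumption for the example, as a mapping from each path to a count of changes not yet propagated to every copy; the function name `crawl_for_stale` is likewise hypothetical.

```python
def crawl_for_stale(volume):
    """Visit every entry in `volume` and inspect its change log.

    `volume` maps a path to a hypothetical change log, modelled here as the
    number of changes that have not yet reached every copy of the data.
    Returns the paths that need self-healing and the number of change logs
    that had to be inspected to find them.
    """
    needs_heal = []
    inspected = 0
    for path, pending_changes in volume.items():
        inspected += 1  # every change log is read, whether stale or not
        if pending_changes > 0:
            needs_heal.append(path)
    return needs_heal, inspected
```

Even when only a few entries are stale, the number of change logs inspected in this sketch equals the total number of files and directories in the volume, which is the overhead described above.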