The following disclosure generally relates to file systems and methods for verifying the integrity of data structures.
Conventional file systems can store metadata associated with a file in several related data structures. The data structures, although organized in different ways, can have some data correspondence since they are sourced from the same metadata. However, data transfers into and out of the data structures can introduce correspondence errors and thereby corrupt the data structures.
Conventional file systems can periodically verify correspondence between data structures to identify correspondence errors. One conventional verification process is to count the number of files represented in each of the data structures and compare the final counts. If the counts match, a file system can assume correspondence. One problem with conventional verifications is a high collision rate. A collision occurs when the counts for two data structures are the same despite the existence of correspondence errors. For example, each data structure can be missing a different file, yet both have the same count.
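The collision described above can be illustrated with a minimal sketch. The structure names and file names below are hypothetical and chosen only for illustration; they do not come from the disclosure.

```python
# Two metadata structures sourced from the same files, each missing a
# different entry due to a correspondence error (names are illustrative).
catalog = {"a.txt", "b.txt", "c.txt"}   # missing "d.txt"
index = {"a.txt", "b.txt", "d.txt"}     # missing "c.txt"

# Count-based verification: the counts match, so the check passes.
counts_match = len(catalog) == len(index)   # True

# Yet the structures do not actually correspond -- a collision.
actually_correspond = catalog == index      # False

print(counts_match, actually_correspond)
```

Because the count discards which files are present, any pair of errors that preserves the totals is invisible to this check.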
Another conventional verification process is to generate, for each data structure, a list of files having a certain attribute and then compare the lists. For large and unordered data structures, a large amount of memory is required to store the lists of file names associated with an attribute, as a data structure can contain millions or billions of file names. Furthermore, an unordered data structure typically requires significant processing resources to cross-check the lists.
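A sketch of this list-based verification follows. The helper function, structure layout, and attribute name are assumptions for illustration only; the point is that both full lists must be materialized in memory, and cross-checking unordered structures requires a sort or equivalent.

```python
# Hypothetical list-based verification (names are illustrative).
def files_with_attr(structure, attr):
    """Collect the names of files in a structure that carry an attribute."""
    return [name for name, attrs in structure.items() if attr in attrs]

catalog = {"a.txt": {"compressed"}, "b.txt": {"compressed"}, "c.txt": set()}
index = {"a.txt": {"compressed"}, "b.txt": set(), "c.txt": set()}

# Both lists are held in memory in full; for unordered structures the
# comparison additionally requires sorting (or hashing) before matching.
list_a = sorted(files_with_attr(catalog, "compressed"))
list_b = sorted(files_with_attr(index, "compressed"))

# A mismatch reveals a correspondence error missed by a simple count.
print(list_a == list_b)
```

With billions of entries, the memory to hold the lists and the processing to sort and compare them become the dominant costs, which is the drawback the passage above identifies.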