File systems use data structures, also referred to as the file system on-disk format, to maintain and organize data on non-volatile, i.e., persistent, storage such as a volume, disk, or hard drive. File systems access and interpret these data structures to store and retrieve data for users and for applications, i.e., procedures or computer programs, e.g., when executing software instructions or computer code.
When the file system encounters an error, i.e., a corruption, while trying to access or interpret the data structures, the file system notes the error and flags the particular storage volume, also referred to herein as the volume, as corrupt. Even if the file system attempts to isolate and correct a reported error online, i.e., operates as a self-healing system, some encountered corruptions cannot be resolved unless the volume is taken offline. However, while the volume is being processed offline to correct errors, the data and information stored in its data structures are unavailable to any user or application. These offline periods can be relatively lengthy, degrading system availability and user satisfaction.
Moreover, when a file system believes it has encountered a corruption while trying to access a volume's data structures, the data structures may not in fact be corrupt. In these cases, the error can generally be attributed to other events, e.g., transient errors in the system's volatile memory, transient errors in the system's non-volatile storage, bugs in the file system, etc. However, with existing technology, file systems cannot distinguish real corruption instances, which require offline processing to remedy, from false positive instances, in which a data structure error is perceived but does not in fact exist.
Thus, it is desirable to verify perceived corruptions before taking a volume offline in order to significantly reduce the amount of time that file system volumes are offline, and thus unavailable for access by users and applications. It would also be desirable to limit the error instances reported to file system administrators and users to those involving verified, i.e., true or real, corruptions.
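For illustration only, the following minimal sketch shows one way a perceived corruption might be verified before a volume is flagged. It assumes, hypothetically, that verification consists of re-reading the suspect data structure and comparing it against a stored CRC-32 checksum; the text above does not specify any verification method, and the names `verify_corruption`, `read_block`, and `expected_crc` are invented for this example.

```python
import zlib
from typing import Callable


def verify_corruption(
    read_block: Callable[[], bytes],  # hypothetical: re-reads the suspect structure
    expected_crc: int,                # hypothetical: checksum recorded when it was written
    attempts: int = 3,
) -> bool:
    """Return True only if every re-read fails the checksum check.

    A single clean read is taken to mean that the originally perceived
    error was a false positive, e.g., a transient memory or storage error.
    """
    for _ in range(attempts):
        data = read_block()
        if zlib.crc32(data) == expected_crc:
            return False  # clean read: treat the earlier error as a false positive
    return True  # error persisted across all re-reads: treat as verified


def make_flaky_reader(good: bytes, bad_reads: int) -> Callable[[], bytes]:
    """Simulated reader that returns garbage for the first `bad_reads` calls."""
    state = {"remaining": bad_reads}

    def read() -> bytes:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return b"garbage"
        return good

    return read


good = b"metadata record"
crc = zlib.crc32(good)

# One transient failure followed by clean reads: not a verified corruption.
transient = verify_corruption(make_flaky_reader(good, 1), crc)

# Every read fails: a verified corruption that could justify offline repair.
persistent = verify_corruption(lambda: b"garbage", crc)
```

Under these assumptions, only the persistent case would be reported to administrators or lead to the volume being taken offline; the transient case would be dismissed without affecting availability.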