In network environments where high availability is a necessity, system administrators constantly face the challenges of preserving data integrity and ensuring the availability of critical system components. One critical component of any computer processing system is its file system. File systems include software programs and data structures that define the use of underlying data storage devices. A file system is responsible for organizing disk sectors into files and directories, and for keeping track of which sectors belong to which file and which sectors are not in use.
The accuracy and consistency of a file system are necessary to relate applications to their data. However, the potential for data corruption always exists in any computer system, and measures are therefore taken to periodically save, or back up, file server state to allow system recovery in the event of faults or failures.
One method for backing up a file system is to collect verified snapshots (‘snaps’) of a consistent file system and to store the snaps as file system checkpoints. When data corruption is detected, one of the checkpoints can be used for file system recovery.
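The checkpoint method described above can be illustrated with a minimal sketch. It assumes the file system is modeled as an in-memory dictionary and that verification is supplied by the caller. The names `CheckpointStore`, `take_snap`, and `recover` are hypothetical, and recovering from the most recent checkpoint is one possible policy chosen for the example, not one stated in the text.

```python
import copy
from typing import Callable, Optional


class CheckpointStore:
    """Toy store of verified snaps (hypothetical; not a real backup API)."""

    def __init__(self, verify: Callable[[dict], bool]):
        self._verify = verify      # caller-supplied consistency check
        self._checkpoints = []     # verified snaps, oldest first

    def take_snap(self, file_system: dict) -> bool:
        """Copy the current state and keep it only if it verifies."""
        snap = copy.deepcopy(file_system)
        if not self._verify(snap):
            return False           # an unverified snap is never stored
        self._checkpoints.append(snap)
        return True

    def recover(self) -> Optional[dict]:
        """Return a copy of the newest checkpoint (assumed policy), or None."""
        if not self._checkpoints:
            return None
        return copy.deepcopy(self._checkpoints[-1])


def all_files_readable(fs: dict) -> bool:
    """Stand-in verifier: every file must hold string contents."""
    return all(isinstance(contents, str) for contents in fs.values())


store = CheckpointStore(all_files_readable)
live_fs = {"a.txt": "hello"}
first_snap_stored = store.take_snap(live_fs)   # consistent state is saved
live_fs["a.txt"] = None                        # simulate data corruption
second_snap_stored = store.take_snap(live_fs)  # corrupt state is rejected
restored = store.recover()                     # falls back to the good snap
```

In this sketch verification happens before storage, so every stored checkpoint is usable for recovery, at the price of running the check on every snap.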
The checkpoint should be verified for accuracy and consistency before it is stored and/or used for recovery. File system verification may be performed via a ‘file system check’ (fsck) utility. A variety of operations are performed during fsck; for example, file system directory structures and block counts are checked for consistency, and data values may be checked for accuracy. Running fsck is time consuming; depending upon the size of the file system, it may take hours, because the time required to perform a file system check generally grows linearly with the amount of data within the file system. Thus, as an increasing number of file system checkpoints are saved, significant processing capability is spent performing fsck on checkpoints which may never be used.
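As a rough illustration of the kind of consistency checks mentioned above, the following sketch verifies recorded block counts and block ownership for a toy file system. It is not the fsck algorithm of any real file system; `Inode`, `check_file_system`, and the free-block set are hypothetical structures assumed for the example. Each block is visited a constant number of times, which is why the running time scales with the size of the file system.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical per-file metadata: a recorded count and the blocks claimed."""
    name: str
    block_count: int
    blocks: list[int] = field(default_factory=list)


def check_file_system(inodes: list[Inode], free_blocks: set[int],
                      total_blocks: int) -> list[str]:
    """Return a list of inconsistencies; an empty list means the toy checks passed."""
    problems: list[str] = []
    owner: dict[int, str] = {}  # block number -> name of the file claiming it

    for inode in inodes:
        # The recorded block count must match the blocks actually listed.
        if inode.block_count != len(inode.blocks):
            problems.append(f"{inode.name}: recorded {inode.block_count} "
                            f"blocks, found {len(inode.blocks)}")
        for block in inode.blocks:
            if not 0 <= block < total_blocks:
                problems.append(f"{inode.name}: block {block} out of range")
            elif block in owner:
                problems.append(f"block {block} claimed by "
                                f"{owner[block]} and {inode.name}")
            else:
                owner[block] = inode.name

    # Every block must be either owned by one file or marked free, never both.
    for block in range(total_blocks):
        used, free = block in owner, block in free_blocks
        if used and free:
            problems.append(f"block {block} is in use but marked free")
        elif not used and not free:
            problems.append(f"block {block} is neither in use nor marked free")
    return problems


good = check_file_system([Inode("a", 2, [0, 1]), Inode("b", 1, [2])], {3}, 4)
bad = check_file_system([Inode("a", 3, [0, 1]), Inode("b", 1, [1])], {2, 3}, 4)
```

Here `good` is empty, while `bad` reports both the wrong block count for file `a` and the block claimed by two files.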