1. Technical Field
The present invention relates in general to node-based file systems and in particular to checking the integrity of node-based file systems. Still more particularly, the present invention relates to reducing latency attributable to reading nodes when validating the integrity of a node-based file system.
2. Description of the Related Art
Node-based file systems, such as the High Performance File System (HPFS) employed by the OS/2 operating system or various UNIX file systems, store information about files in two locations on the storage media. In HPFS, for example, the first location is the directory entry for the file, while the second location is the F-Node for the file. When checking a disk drive containing files under HPFS, both locations must be checked for every file on the drive. In the case of HPFS, directory entries are located in Directory Blocks (DIRBLKs), which HPFS attempts to locate in the seek-center of the disk or partition. For performance reasons, HPFS locates the F-Node for a file near the location on the disk where the file's data resides. Moreover, to avoid fragmenting files, HPFS distributes files across the disk or partition instead of packing them closely together. The result is that F-Nodes tend to be randomly distributed across the disk or partition.
When the integrity of HPFS files is validated during recovery of a system that may have experienced a power outage or crash, the traditional method for checking an HPFS drive consists of walking the directory structure of the drive and checking each directory entry and its corresponding F-Node as they are encountered. Since HPFS attempts to keep the DIRBLKs that contain the directory entries in the center of the drive, the head motion required to go from one DIRBLK to the next should be minimal. However, since the F-Node for each directory entry is conventionally checked when the directory entry is checked, and since F-Nodes are randomly distributed across the disk or partition, the head must travel to an F-Node and back between consecutive DIRBLK reads, and any benefit from having the DIRBLKs grouped together in the seek-center of the drive is lost.
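The cost of the traditional walk can be illustrated with a minimal sketch. It assumes a simplified model in which disk positions are plain integers and seek cost equals the distance the head travels; the function names and the sample positions are hypothetical and are not part of HPFS.

```python
def traditional_walk_travel(dirblk_positions, fnode_positions):
    """Head travel when each F-Node is read right after its directory entry.

    Assumption: positions are abstract integers and seek cost is the
    absolute distance between consecutive reads.
    """
    head = dirblk_positions[0]
    travel = 0
    for dirblk, fnode in zip(dirblk_positions, fnode_positions):
        travel += abs(dirblk - head)   # seek to the directory entry
        travel += abs(fnode - dirblk)  # seek out to the file's F-Node
        head = fnode
    return travel


def dirblk_only_travel(dirblk_positions):
    """Head travel if only the clustered DIRBLKs were read, for comparison."""
    head = dirblk_positions[0]
    travel = 0
    for dirblk in dirblk_positions:
        travel += abs(dirblk - head)
        head = dirblk
    return travel


# Hypothetical layout: DIRBLKs clustered near the seek-center,
# F-Nodes scattered across the disk.
dirblks = [500, 501, 502]
fnodes = [40, 960, 130]

interleaved = traditional_walk_travel(dirblks, fnodes)
clustered = dirblk_only_travel(dirblks)
print(interleaved, clustered)
```

In this toy layout, nearly all of the head travel comes from the trips to the scattered F-Nodes; the DIRBLK-to-DIRBLK movement alone is negligible.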
The problem of optimizing integrity checking is further complicated when the files for a node-based file system such as HPFS are stored on redundant array of inexpensive disks (RAID) storage media. In RAID systems, several disk drives are connected in an array, where they are combined and coordinated to give the appearance of a single disk drive. Currently there are six accepted standards for RAID implementation, RAID 0 through RAID 5. In the most popular RAID implementations, RAID 0 and RAID 5, data is striped across the drives being combined. That is, sequential units of data are stored on different disk drives, thereby spreading the data across all of the drives in the array. This allows the RAID controller to direct read requests to multiple drives simultaneously. However, the traditional method of validating a file system reads one structure at a time and therefore cannot take advantage of the ability to issue multiple input/output (I/O) requests concurrently.
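Striping can be sketched as a simple address mapping. The following is a minimal illustration for RAID 0 only, assuming a fixed stripe unit measured in blocks; the function name and parameters are hypothetical, and RAID 5 differs because it also places parity across the drives.

```python
def raid0_locate(logical_block, n_disks, stripe_unit_blocks=1):
    """Map a logical block to (disk index, block offset on that disk).

    Assumption: plain RAID 0 striping with a fixed stripe unit and no parity.
    """
    stripe_index = logical_block // stripe_unit_blocks
    disk = stripe_index % n_disks
    offset = ((stripe_index // n_disks) * stripe_unit_blocks
              + logical_block % stripe_unit_blocks)
    return disk, offset


# Sequential logical blocks land on different drives in turn.
disks = [raid0_locate(block, n_disks=4)[0] for block in range(8)]
print(disks)
```

Because consecutive stripe units fall on different drives, reads for blocks that map to different disk indices can in principle be serviced at the same time, which a one-read-at-a-time validation pass does not exploit.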
It would be desirable, therefore, to devise a method of validating the integrity of a node-based file system which reduces the overhead required to read both locations of file information, namely the directory entry and the node for each file. It would further be advantageous if the method could be optimized to take advantage of the ability in RAID systems to issue multiple read requests.