Block storage devices, such as disk drives and tape drives, are commonly used for storage of computer data.
Block storage devices typically store information in equally sized portions, or "blocks." If a data file is smaller than a single block, then a whole block is used to store the data, and the remainder of the block is unused. If a data file is larger than a block, then two or more blocks, which often are not contiguous, can be used to store the data. Here too, the final block allocated to the file often contains unused space. Block storage devices typically use a writable medium, such as a magnetic or optical medium, to store the computer data, although other forms of electronic memory can also be used.
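The block arithmetic described above can be illustrated with a minimal sketch. The 512-byte block size and the function names are assumptions chosen for illustration only; they are not taken from any particular device or file system.

```python
BLOCK_SIZE = 512  # assumed block size in bytes, for illustration only


def blocks_needed(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the number of whole blocks required to hold file_size bytes."""
    if file_size <= 0:
        return 0  # simplification: an empty file is treated as using no blocks
    return -(-file_size // block_size)  # ceiling division


def unused_bytes(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the bytes left unused in the last block allocated to the file."""
    return blocks_needed(file_size, block_size) * block_size - file_size


small = (blocks_needed(100), unused_bytes(100))     # file smaller than one block
large = (blocks_needed(1300), unused_bytes(1300))   # file spanning several blocks
exact = (blocks_needed(1024), unused_bytes(1024))   # file that fills its blocks exactly
```

Under these assumptions, a 100-byte file occupies one block and leaves 412 bytes unused, while a 1,300-byte file occupies three blocks and leaves 236 bytes unused.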
A single disk or tape device usually has a capacity of many blocks, often into the millions, allowing many data files to be stored. In order to make access to files stored on a block storage device more efficient, these files can be organized into groups known as "directories." In this way, data files having similar content, usage, or characteristics can be grouped together for a user's convenience. Directories are actually data files that contain information specifying where data files are stored on the block storage device. They can often be hierarchical; in other words, directory files can point to other directory files, which in turn point to data files. The location of a file within a directory hierarchy can be specified by means of a "pathname" to the file, which indicates the name of each directory traversed as well as the filename.
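As a rough sketch of how a pathname is resolved through a directory hierarchy, the following models each directory as a mapping from names to either another directory or the starting block number of a data file. The nested-dictionary layout, the "/" separator, and the `resolve` function are hypothetical simplifications, not the on-disk format of any real file system.

```python
def resolve(root: dict, pathname: str, sep: str = "/"):
    """Walk the directory hierarchy one name at a time and return the entry found."""
    node = root
    for name in (part for part in pathname.split(sep) if part):
        # Each step must land in a directory that contains the next name.
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(pathname)
        node = node[name]
    return node


# Hypothetical hierarchy: directories are dicts, data files are starting block numbers.
root = {
    "docs": {"reports": {"q1.txt": 2395}},
    "readme.txt": 17,
}

start_block = resolve(root, "/docs/reports/q1.txt")
```

Here `start_block` is 2395, reached by traversing the "docs" and "reports" directories in turn. A name missing at any level raises `FileNotFoundError`.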
In addition to data files and directory files, block storage devices usually contain a small amount of additional information in a "system area." This additional information specifies, among other things, what space on the device is used and what is available for use. When a computer seeks to write information to a block storage device, the system area is accessed to determine where the information can be written without overwriting other information. Similarly, when a computer seeks to read information from a block storage device, the system area is accessed to determine where the desired information was written.
In most traditional file systems, the system area and directories are overhead. That is, the data contained there has essentially no intrinsic value; it is used to track the location of data on the block storage device. Accordingly, it is useful to minimize the impact of the system information on storage capacity. Traditional file systems often have system overhead in the range of 2-3% of capacity. In other words, 2-3% of a given block storage device is devoted to system data, directories, and other information, and is unavailable for use in data storage.
Although the above characterizations of how data is stored on block storage devices are generally true, it should be recognized that a number of specific formats for utilizing the foregoing data types are presently known and used, as will be discussed in detail below.
The data stored on a block storage device can be damaged in a number of ways. An errant computer program can accidentally write information to one or more previously allocated blocks. A power aberration can cause a write operation to be only partially completed, or can cause a computer to write inaccurate data. Moreover, mechanical or electronic failure of a block storage device is possible. Magnetic storage devices are particularly susceptible to environmental factors, such as temperature and electromagnetic fields.
While certain known steps can be taken to prevent many causes of failure, some errors are considered to be inevitable. Accordingly, a need exists for a way to repair damage when it occurs.
One format for the storage of data on block storage devices is known as the "FAT" ("File Allocation Table") file system. The best-known implementation of a FAT system is used on PC-compatible computers by Microsoft MS-DOS and Microsoft Windows 95, although other computers and operating systems use similar systems. On a disk using the FAT file system, the system area contains a "root," or highest level directory containing information on other directories and data files on the disk. The root directory can have a number of directory entries, each of which contains information on a single directory or data file, including its name and a number corresponding to where the file begins on the disk. Each individual block on the disk has a unique identification number for this purpose.
The system area of a FAT disk also includes a file allocation table, which is an array of block numbers or "pointers" to locations on the disk holding data belonging to data files. The file allocation table has one entry, capable of holding a number, for each block on the disk. If the block corresponding to an entry in the file allocation table is not allocated to any directory or data file, then the entry contains a unique numeric identifier specifying that condition. If the block is allocated, then the entry contains a number specifying which block is the next one to store a successive portion of the file in question. If no more blocks are needed to store the file, another unique numeric identifier is used to specify that condition.
Accordingly, under the FAT file system, files need not be stored in consecutive blocks on the disk. Consider, as an example, a disk having 10,000 blocks and one desired file, two blocks long. Assume that the first portion of the file is stored in block number 2,395 and the second portion is stored in block number 6,911. A computer desiring to access that file will first check the root directory in the system area of the disk. If it finds the name of the desired file, it will check the number in the root directory entry corresponding to the start of the file. In the present example, that number will be 2,395. Consequently, the computer will access the first part of the file from block number 2,395.
The computer will then access the file allocation table. In entry number 2,395 of the table, the number 6,911 will be stored, indicating that the file continues in block number 6,911, and does not end after the first block. The computer can then retrieve the second part of the file from block number 6,911. The computer will then access the file allocation table again. Entry number 6,911 of the file allocation table will contain a number such as 65,535, indicating that the end of the file has been reached. Since there are only 10,000 blocks on the exemplary disk, it is not possible for data to be stored in block number 65,535.
The foregoing scheme is used for each data file and directory file on the disk. Blocks that are unused can have corresponding file allocation table entries of zero, for example.
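The lookup in the foregoing example can be sketched as follows, using the same numbers: a 10,000-block disk, a file starting at block 2,395 and continuing at block 6,911, an end-of-file marker of 65,535, and zero for unused entries. The list-based table, the file name, and the `file_blocks` function are illustrative assumptions and do not reproduce the actual on-disk FAT layout.

```python
FREE = 0             # marker for an unallocated block, as in the example
END_OF_FILE = 65535  # marker for the last block of a file, as in the example


def file_blocks(fat: list, start: int) -> list:
    """Follow the chain of table entries from start and return the blocks in order."""
    chain = []
    block = start
    while block != END_OF_FILE:
        # Guard against a damaged table: bad pointers, free entries, or loops.
        if not 0 <= block < len(fat) or fat[block] == FREE or block in chain:
            raise ValueError(f"damaged allocation chain at block {block}")
        chain.append(block)
        block = fat[block]
    return chain


fat = [FREE] * 10_000            # one entry per block on the disk
fat[2395] = 6911                 # the file continues in block 6,911
fat[6911] = END_OF_FILE          # block 6,911 is the last block of the file

root_directory = {"REPORT.TXT": 2395}  # hypothetical file name -> first block

blocks = file_blocks(fat, root_directory["REPORT.TXT"])
```

With this table, `blocks` is `[2395, 6911]`. Because zero doubles as the "unused" marker in this sketch, block 0 cannot appear in a chain; this is a simplification.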
As a result, it is apparent that the FAT system is vulnerable to damage. If the file allocation table is damaged, directories should still point to the first block of each file, but remaining portions of the files may be lost. If the file allocation table contains incorrect information, retrieved data files might contain data that in fact belongs to a different file, a phenomenon known as "crosslinking." If the root directory or directory files are damaged, the file allocation table should still contain correct information, and the presence of files on the disk can in principle be ascertained, but there would be no way to determine their names and directories.
Furthermore, because of the hierarchical nature of the directories, it should be noted that damage to an intermediate-level directory can result in the loss of all files in lower level directories.
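The crosslinking condition mentioned above can be detected, in principle, by walking every file's chain and noting any block claimed by more than one file. The sketch below assumes the same simplified table as the earlier example (zero for unused, 65,535 for end of file). The `find_crosslinks` function and the file names are hypothetical and do not represent the algorithm of any particular repair utility.

```python
from collections import defaultdict

FREE = 0
END_OF_FILE = 65535


def find_crosslinks(fat: list, starts: dict) -> dict:
    """Return {block: [file names]} for every block reached from more than one file."""
    owners = defaultdict(set)
    for name, start in starts.items():
        block = start
        visited = set()
        while block != END_OF_FILE:
            # Stop on a bad pointer, a free entry, or a loop within this chain.
            if not 0 <= block < len(fat) or fat[block] == FREE or block in visited:
                break
            visited.add(block)
            owners[block].add(name)
            block = fat[block]
    return {b: sorted(names) for b, names in owners.items() if len(names) > 1}


fat = [FREE] * 20
fat[3], fat[4] = 4, END_OF_FILE   # file A occupies blocks 3 and 4
fat[7] = 4                        # damaged entry: file B now runs into A's block 4
fat[8] = END_OF_FILE              # B's real second block is no longer reachable

crosslinks = find_crosslinks(fat, {"A": 3, "B": 7})
```

In this example `crosslinks` reports that block 4 is shared by files A and B. Block 8, which originally belonged to B, is marked as allocated but is no longer reachable from any directory entry.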
Another file system is known as "HPFS," the "High Performance File System." HPFS was originated by Microsoft and adopted by IBM for use with the OS/2 operating system for PC-compatible computers. HPFS does not use a file allocation table to indicate how files are linked together. Rather, each directory entry points to an "Fnode," or "File node," which contains a list of blocks used by the file. The Fnode also contains the filename for the file. Information on whether or not disk space is allocated is maintained in "bitmaps," small data structures which reflect only whether blocks are in use, and not any information on how particular files are allocated. Accordingly, under HPFS, file allocation information is spread throughout the disk, rather than being stored in a single system area.
HPFS is therefore somewhat more resistant to damage than the FAT system. Any damage to the space allocation bitmaps can be corrected by scanning the disk for Fnodes and files. Damage to a particular directory entry can sometimes be corrected by scanning the disk for Fnodes and reconnecting them to the damaged directory, using the filenames stored in the Fnodes. However, damage to one or more Fnodes can render data files essentially unrecoverable, since there would be no way to determine what blocks belong to which file, and in what order. Moreover, if the root directory or other system areas are damaged, use of the entire disk can be lost.
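The bitmap repair described above can be sketched in simplified form: given the Fnodes found by scanning the disk, every block that holds an Fnode or is listed by one is marked as in use. The `Fnode` class, the dictionary keyed by Fnode location, and `rebuild_bitmap` are hypothetical stand-ins for the real HPFS structures, which are considerably more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Fnode:
    """Simplified stand-in for an HPFS Fnode: a filename plus the file's blocks."""
    filename: str
    blocks: list


def rebuild_bitmap(total_blocks: int, fnodes: dict) -> list:
    """Rebuild the in-use bitmap from Fnodes found by scanning the disk.

    fnodes maps the block number holding each Fnode to the Fnode itself.
    """
    in_use = [False] * total_blocks
    for fnode_block, fnode in fnodes.items():
        in_use[fnode_block] = True        # the Fnode occupies a block of its own
        for block in fnode.blocks:
            in_use[block] = True          # every block listed by the Fnode is allocated
    return in_use


# Hypothetical result of scanning a 16-block disk for Fnodes.
found = {
    5: Fnode("a.txt", [6, 9]),
    12: Fnode("b.txt", [13]),
}

bitmap = rebuild_bitmap(16, found)
allocated = [i for i, used in enumerate(bitmap) if used]
recovered_names = sorted(f.filename for f in found.values())
```

Here `allocated` is `[5, 6, 9, 12, 13]`. The filenames kept in the Fnodes (`recovered_names`) are what would allow files to be reconnected to a damaged directory. If an Fnode itself is lost, the sketch has no way to learn which blocks belonged to its file, which mirrors the weakness noted above.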
Microsoft also originated "NTFS," or "New Technology File System," as a successor to HPFS and FAT. NTFS is now supported by Microsoft Windows NT, which runs on PC-compatible and certain other computers. NTFS is similar to HPFS in that file allocation information is not stored in a central file allocation table. However, a Master File Table is used to store system information, root directory information, and small files and subdirectories. Lists of blocks used by larger files and directories are kept with the corresponding directory information, whenever possible.
Consequently, damage to the Master File Table or subdirectories can result in unrecoverable files, since there would be no way to associate data found on the disk with any particular directory. NTFS maintains a backup copy of the first 16 files in the Master File Table, with redundant information, to help alleviate this problem. However, an errant software process can damage both copies of the Master File Table.
Numerous other file systems exist for various types of computers and operating systems. The foregoing discussion of FAT, HPFS, and NTFS is intended to be representative, showing certain drawbacks of common file systems.
Several known methods exist for protecting data from loss or damage resulting from the designs discussed above.
For example, copies of critical data can be made on a second block storage device. This process is known as "backing up" the data. Such backup copies can be made at periodic intervals (like a "snapshot" of the disk) or concurrently (called "mirroring") as data is written to the disk (as in "RAID" redundant disk arrays). However, periodic backups can be tedious, interfering with regular use of the computer while the backup is occurring and requiring user intervention to insert backup media. Periodic backups might also "miss" important data if that data is created and then lost to a failure before the next scheduled backup occurs. Concurrent backups have the disadvantage that an errant software program can damage or destroy the backup copy of the information as well as the original.
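The tradeoff between periodic and concurrent backups can be illustrated with a toy model in which each device is a mapping from block numbers to contents. The dictionaries, the `mirrored_write` function, and the sample data are assumptions made purely for illustration.

```python
primary = {}   # the original block storage device
mirror = {}    # a second device kept in step with the first (concurrent backup)


def mirrored_write(block: int, data: str) -> None:
    """Write the same data to both devices, as a concurrent backup would."""
    primary[block] = data
    mirror[block] = data


mirrored_write(1, "ledger v1")

snapshot = dict(primary)           # periodic backup taken at this moment

mirrored_write(2, "new report")    # data created after the snapshot
mirrored_write(1, "garbage")       # an errant program overwrites block 1

mirror_has_damage = mirror[1] == "garbage"          # damage reached the mirror too
snapshot_has_original = snapshot[1] == "ledger v1"  # the snapshot kept the old data
snapshot_missed_new = 2 not in snapshot             # but it never saw the new report
```

In this model the errant write reaches the mirror as faithfully as any legitimate write, while the snapshot preserves the earlier contents of block 1 but has no copy of the data written after it was taken.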
A second approach to data protection is found in the "IMAGE" or "MIRROR" program found with Microsoft MS-DOS. With this approach, backup copies of certain critical data structures from the system area are kept in a separate area of the block storage device. Accordingly, if damage occurs to the original structures, the copies can be used to retrieve data from the device. However, this approach has the same disadvantages as full backups. Concurrent mirroring is susceptible to software problems and can degrade performance (as certain data must be written twice), and periodic mirroring can be out-of-date when damage occurs. Moreover, if the location of the image data is not ascertainable (or is not fixed in a known position on the storage device), the image file is useless in repairing a damaged volume.
Accordingly, as indicated above, a need exists for a file system for block storage devices having enhanced capabilities for data recovery in the event of damage to various areas on the storage device. Such a file system must be robust and convenient, and should not significantly degrade system performance.