The present invention relates to file systems and, more particularly, to a file system that is robust with respect to unexpected interruptions such as sudden power loss and that is self-maintaining.
File systems, which enable computer applications to handle data, exist in all computer systems and are generally supplied as part of the operating system. A file system generally specifies a format and structure for data residing on a storage medium (e.g. a magnetic disk), provides an interface to the medium driver to handle physical data I/O, provides an interface to applications to perform data handling operations (such as creating a file, reading data from files, and searching in directories), and provides the algorithms and procedures that carry out the physical data I/O requested by the applications.
Many different types of file systems are used in computer systems, each file system providing a different way of organizing and handling data. However, one type of file system, the DOS-FAT file system, is exceptionally common. This file system was originally developed for Microsoft's DOS™, and is now used in all Microsoft Windows™ operating systems. Because DOS-FAT is ubiquitous, most non-Microsoft operating systems (e.g. Linux and Apple's Mac-OS™) that have their own file system also support DOS-FAT.
The DOS-FAT file system is also called simply the “FAT” file system or the “DOS” file system. “FAT” is an acronym for File Allocation Table, the central structure of this file system. The file system structure and format have remained very stable since their introduction in the 1980s, although several important additions have been made over the years, such as support for long filenames and the FAT32 variant (to support very large disks).
A key requirement of a file system is that it be reliable and robust. Expected conditions in which the file system is used must not result in loss or corruption of data stored by it. Such conditions include a sudden and unexpected loss of power, a rebooting of the system, or any similar action that results in file system operations being interrupted at an indeterminate stage. An even more basic requirement is that the file system format itself, as written on the disk, must not be damaged. If a file is being created while power is lost in the system, the fact that the contents of the file are in an indefinite state is often not a problem, as the application creating it may be rerun. But if such a mishap results in the contents of directories being damaged or lost, the damage will be more pervasive and possibly irreversible, as a large part of the storage medium (or the entire storage medium) may become inaccessible.
The DOS-FAT file system is extensively documented in many places, for example, Ray Duncan, Advanced MSDOS Programming, Second Edition, Chapter 10: Disk Internals (Microsoft Press, 1988). The following aspects of DOS-FAT are the ones that are most relevant to the present invention:
A DOS-FAT storage medium is physically divided into sectors (traditionally of 512 bytes each). From the file system's point of view, the storage medium is a linear array of sectors, starting from the first sector, sector 0.
The lowermost sectors of the storage medium contain the basic DOS-FAT structures, including the FAT (file allocation table). These are followed by the rest of the storage medium, which contains all the file and directory data and the available free space. This part is divided into allocation units, also called clusters. An allocation unit is the minimum space that can be allocated to a file or directory, and its size is fixed throughout the storage medium. The size of the allocation unit is a multiple of the sector size, e.g. 4 Kbytes (= 8 sectors of 512 bytes).
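The relationship between sectors and allocation units can be illustrated with a little arithmetic. The following is a minimal sketch only; the function names are hypothetical and chosen for illustration, and the 512-byte sector is the traditional value mentioned above.

```python
SECTOR_SIZE = 512  # bytes; the traditional sector size


def sectors_per_cluster(cluster_bytes, sector_bytes=SECTOR_SIZE):
    """Return how many sectors make up one allocation unit (cluster)."""
    if cluster_bytes % sector_bytes != 0:
        raise ValueError("cluster size must be a multiple of the sector size")
    return cluster_bytes // sector_bytes


def clusters_needed(file_bytes, cluster_bytes):
    """Return the number of whole clusters needed to hold file_bytes."""
    return -(-file_bytes // cluster_bytes)  # ceiling division


# A 4-Kbyte allocation unit spans 8 sectors, and a 1000-byte file
# still occupies one whole allocation unit.
example_sectors = sectors_per_cluster(4096)      # 8
example_clusters = clusters_needed(1000, 4096)   # 1
```

Because the allocation unit is the minimum space that can be allocated, any file of 1 to 4096 bytes occupies exactly one 4-Kbyte allocation unit in this example.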
The FAT is a table that indicates the status of each allocation unit. A FAT entry may show that an allocation unit is free space, or it may show that the allocation unit is allocated to a file (though it will not show to which file). In the latter case, the FAT entry also indicates what the next allocation unit for the file is, or indicates that this allocation unit is the last allocation unit for the file. This organization leads to a file having a FAT chain: a list of chained entries in the file allocation table showing which allocation units belong to the file and in which order.
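The notion of a FAT chain can be sketched as follows. This is a simplified model, not the on-disk format: the FAT is represented as a Python list, and the markers FREE and END_OF_CHAIN are hypothetical stand-ins for the reserved values that a real FAT uses.

```python
FREE = None          # hypothetical marker: allocation unit is free
END_OF_CHAIN = -1    # hypothetical marker: last allocation unit of a file


def follow_chain(fat, start):
    """Return the allocation units of a file, in order, starting at `start`."""
    chain = []
    seen = set()
    cluster = start
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("FAT chain loops back on itself")
        if fat[cluster] is FREE:
            raise ValueError("FAT chain runs into a free entry")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # the entry names the next allocation unit
    return chain


# A file occupying allocation units 2, 5 and 3, in that order.
fat = [FREE] * 8
fat[2] = 5
fat[5] = 3
fat[3] = END_OF_CHAIN
chain = follow_chain(fat, 2)  # [2, 5, 3]
```

Note that nothing in the table itself says which file owns the chain; only the starting allocation unit, held elsewhere, ties the chain to a file.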
For allocation purposes, a directory is just a file, albeit a file with special contents that are recognized as such by the file system. A directory file contains an array of directory entries, each directory entry being of 32 bytes and separated into several fields. Each directory entry describes one file that is in that directory. If long filenames are supported, several directory entries may be used to describe a file. One of the fields in the directory entry is the starting cluster field, indicating the initial allocation unit (cluster) of the file. In this way the directory entry of a file is linked to the file's FAT chain.
Another field in the directory entry is the file size, which indicates the size in bytes of the file.
A file with a long name is described by several directory entries, the last of which is the short-form or legacy directory entry. The legacy directory entry is preceded by one or more directory entries that describe the full name of the file.
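A 32-byte legacy directory entry can be sketched with Python's struct module. The field layout below (8-byte name, 3-byte extension, attribute byte, 10 reserved bytes, time, date, starting cluster, file size) is the classic FAT12/FAT16 arrangement and is given here as an assumption for illustration; the names DirEntry, build_entry and parse_entry are hypothetical.

```python
import struct
from collections import namedtuple

# Little-endian, no padding: 8 + 3 + 1 + 10 + 2 + 2 + 2 + 4 = 32 bytes.
ENTRY_FORMAT = "<8s3sB10sHHHI"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)

DirEntry = namedtuple(
    "DirEntry",
    "name ext attributes reserved time date start_cluster size",
)


def build_entry(name, ext, start_cluster, size):
    """Pack a legacy (short-name) directory entry into 32 bytes."""
    return struct.pack(
        ENTRY_FORMAT,
        name.ljust(8).encode("ascii"),  # name is space-padded to 8 bytes
        ext.ljust(3).encode("ascii"),   # extension is space-padded to 3 bytes
        0,                              # attributes (none set in this sketch)
        bytes(10),                      # reserved
        0,                              # time (left unset in this sketch)
        0,                              # date (left unset in this sketch)
        start_cluster,
        size,
    )


def parse_entry(raw):
    """Unpack 32 raw bytes into a DirEntry."""
    if len(raw) != ENTRY_SIZE:
        raise ValueError("a directory entry is exactly 32 bytes")
    return DirEntry(*struct.unpack(ENTRY_FORMAT, raw))


raw = build_entry("MYDATA", "TXT", start_cluster=2, size=1000)
entry = parse_entry(raw)
```

Here entry.start_cluster links the directory entry to the file's FAT chain, and entry.size records the length of the file in bytes.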
Because of this structure, an implementation of a DOS-FAT file system needs to do several things in order to execute a simple file system request. For example, to create a file called MYDATA.TXT with 1000 bytes of data, the file system needs to perform all of the following operations, not necessarily in this order:
1. Find a free directory entry in the parent directory and write a new MYDATA.TXT entry in the free directory entry.
2. Find a free FAT entry and mark the free FAT entry as belonging to a file.
3. Write the 1000 bytes of data to the corresponding allocation unit found in step 2.
4. Set the starting cluster field in the directory entry to the allocation unit number.
5. Set the file size field in the directory entry to 1000.
No matter in what order this sequence of operations is done, more than one physical I/O operation is needed to do the operations. Therefore, loss of power may cause this sequence of operations, once begun, to be incomplete, leaving the medium structure in an inconsistent state.
For example, if step 2 is completed and step 4 is not, the FAT now denotes an allocation unit as belonging to a file. However, no directory entry points to this allocation unit. The result is that this allocation unit is lost to further allocation, as there is no mechanism to delete or reuse it.
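The interrupted file creation described above can be simulated with a toy in-memory model. This is a sketch under simplifying assumptions: the FAT is a list, the directory is a dictionary, writing the file data (step 3) is omitted, and the names PowerLoss and create_file are hypothetical.

```python
FREE = None          # hypothetical marker: allocation unit is free
END_OF_CHAIN = -1    # hypothetical marker: last allocation unit of a file


class PowerLoss(Exception):
    """Stands in for a sudden interruption between two physical writes."""


def create_file(fat, directory, name, size, power_fails_after_fat_update=False):
    """Create a one-cluster file; optionally stop after the FAT update."""
    # Step 2: find a free FAT entry and mark it as belonging to a file.
    cluster = fat.index(FREE)
    fat[cluster] = END_OF_CHAIN
    if power_fails_after_fat_update:
        raise PowerLoss("interrupted before the directory entry was written")
    # Steps 1, 4 and 5: write the directory entry with start cluster and size.
    directory[name] = {"start_cluster": cluster, "size": size}
    return cluster


fat = [FREE] * 4
directory = {}
try:
    create_file(fat, directory, "MYDATA.TXT", 1000,
                power_fails_after_fat_update=True)
except PowerLoss:
    pass

# The FAT says allocation unit 0 is in use, yet no directory entry refers to it.
allocated = [i for i, entry in enumerate(fat) if entry is not FREE]
```

After the simulated interruption, allocated contains allocation unit 0 while directory is still empty, so later allocations skip a unit that no file can ever release.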
The same applies to the process of deleting a file. File deletion requires the operations of marking the directory entry of the file as deleted, and marking each of the FAT entries in the file's FAT chain as free. No matter in what order these operations are done, an interruption will cause an inconsistency in the medium structures. Deleting the FAT entries first risks leaving the directory entry “alive” so the file is seen as still existing. Furthermore, the entry's starting cluster still points to FAT entries that have now been made available for new allocations, so eventually these entries will be allocated to another file. Deleting the directory entry first avoids this, but risks making the entire FAT chain or part of it inaccessible if the delete process is not allowed to conclude.
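The hazard of freeing the FAT entries first can be sketched with the same kind of toy model. The list-based FAT, the dictionary-based directory, and the function names below are assumptions made for illustration only.

```python
FREE = None          # hypothetical marker: allocation unit is free
END_OF_CHAIN = -1    # hypothetical marker: last allocation unit of a file


def delete_fat_first(fat, directory, name, interrupted=False):
    """Free the file's FAT chain, then remove its directory entry."""
    cluster = directory[name]["start_cluster"]
    while cluster != END_OF_CHAIN:
        next_cluster = fat[cluster]
        fat[cluster] = FREE
        cluster = next_cluster
    if interrupted:
        return  # power lost before the directory entry was marked deleted
    del directory[name]


def allocate_one(fat, directory, name):
    """Give a new file the first free allocation unit."""
    cluster = fat.index(FREE)
    fat[cluster] = END_OF_CHAIN
    directory[name] = {"start_cluster": cluster}
    return cluster


# OLD.TXT occupies allocation units 0 and 1.
fat = [1, END_OF_CHAIN, FREE, FREE]
directory = {"OLD.TXT": {"start_cluster": 0}}

delete_fat_first(fat, directory, "OLD.TXT", interrupted=True)
allocate_one(fat, directory, "NEW.TXT")
```

After these two calls, both OLD.TXT and NEW.TXT name allocation unit 0 as their starting cluster: the directory entry that survived the interruption now points into space owned by another file.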
These are only two examples of a shortcoming of the DOS-FAT file system with regard to reliability and robustness. These shortcomings stem from the way the DOS-FAT media format is organized. Their consequences take several forms, the most common of which are:
1. Space on the storage medium may be marked as allocated, although it does not belong to any file. This is usually called the “lost cluster” effect, as there is a part of the storage medium that becomes “lost” to the file system. If such events occur many times, many lost clusters accumulate and cause medium capacity to diminish. In the file deletion example above, if the directory entry is deleted first, there is a risk that all or part of the FAT chain will become lost clusters.
2. Space on the storage medium may become marked as belonging to more than one file at the same time. This is known as a “cross link”. This may cause several types of failures and data loss at a later stage. In the file deletion example above, if the FAT entries are deleted first, and an interruption leaves the directory entry “alive”, a subsequent allocation of the FAT entries to another file causes a cross link in which two files apparently, and inconsistently, share the same space in the storage medium.
3. Most DOS-FAT systems have several identical copies of the FAT. These copies may become unsynchronized.
Several other failure patterns are also possible, each causing a specific kind of damage or risk to existing data.
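The first two failure patterns can be detected by walking every file's FAT chain and recording which files reach each allocation unit. The following is a minimal sketch of such a scan over a toy in-memory model; it is not the algorithm of any particular utility, and the name check_volume and the data layout are hypothetical.

```python
FREE = None          # hypothetical marker: allocation unit is free
END_OF_CHAIN = -1    # hypothetical marker: last allocation unit of a file


def check_volume(fat, directory):
    """Return (lost_clusters, cross_links) for a toy FAT volume.

    `directory` maps each file name to its starting allocation unit.
    """
    owners = {}  # allocation unit -> names of the files whose chains reach it
    for name, start in directory.items():
        cluster = start
        visited = set()
        while cluster != END_OF_CHAIN and cluster not in visited:
            visited.add(cluster)
            owners.setdefault(cluster, []).append(name)
            next_cluster = fat[cluster]
            if next_cluster is FREE:
                break  # chain runs into a free entry; stop following it
            cluster = next_cluster
    # Lost clusters: marked allocated, but reached by no file.
    lost = [c for c, entry in enumerate(fat)
            if entry is not FREE and c not in owners]
    # Cross links: reached by more than one file.
    cross = {c: names for c, names in owners.items() if len(names) > 1}
    return lost, cross


# A.TXT uses units 0 and 1; B.TXT wrongly starts at unit 1;
# units 2 and 4 are allocated but belong to no file.
fat = [1, END_OF_CHAIN, END_OF_CHAIN, FREE, END_OF_CHAIN]
directory = {"A.TXT": 0, "B.TXT": 1}
lost, cross = check_volume(fat, directory)
```

For this example the scan reports allocation units 2 and 4 as lost clusters and allocation unit 1 as cross-linked between A.TXT and B.TXT. Detecting the damage is the easy part; deciding which file should keep a cross-linked unit generally cannot be determined from the structures alone.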
These failure modes of the DOS-FAT file system have been well known for a long time, and maintenance tools have been provided to deal with them. Originally, DOS™ supplied a utility called CHKDSK, which could be run at any time by the user to scan a DOS-FAT disk for inconsistencies, and optionally could repair the inconsistencies (often by applying guesswork as to what the correct state should be). In DOS™ version 6.0 and later in Windows™ operating systems, CHKDSK was replaced by SCANDISK, a more sophisticated utility that essentially did the same as CHKDSK.
The running of CHKDSK- or SCANDISK-type utilities is left to the system user, i.e. the user is expected to perform maintenance on the disk and to be able to deduce when such maintenance is necessary. That this is not a satisfactory solution has long been recognized. In most current versions of the Windows™ operating systems, Windows™ automatically offers to run SCANDISK whenever it detects that the system has not been shut down in an orderly manner.
This maintenance utility solution to the problem is even less appropriate for operating systems like Windows CE™, which is used as the operating system of many consumer appliances such as organizers and mobile phones. The user of an appliance expects the appliance to always work well and is either incapable of maintaining it or unwilling to do so, even if provided with tools for the purpose. The need to provide a reliable but self-maintaining file system for such devices is therefore urgent.
File systems that are robust with respect to unexpected interruptions are known. One such file system is the Journaling Flash File System (JFFS). JFFS is simply a log-structured list of nodes on the storage medium. Each node contains information about the associated file and possibly file data. If data are present, the node contains a field that indicates the location in the file where the data should appear. This prevents new data from overwriting old data. The node also contains information about the amount and location of data to delete from the file. This information is used for truncating files or overwriting selected data within a file. In addition, each node contains information that is used to indicate the relative age of the node. In order to recreate a file, the entire medium is scanned, the individual nodes are sorted in order of increasing version number, and the data are processed according to the instructions in each node.
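Recreating a file from such a log can be sketched as a replay of nodes in version order. The node fields below are a hypothetical simplification for illustration and do not reproduce the actual JFFS node format; rebuild_file is likewise an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class Node:
    version: int            # relative age: higher means newer
    offset: int = 0         # where `data` should appear in the file
    data: bytes = b""       # file data carried by this node, if any
    delete_offset: int = 0  # where to delete from
    delete_length: int = 0  # how many bytes to delete (0 means none)


def rebuild_file(nodes):
    """Replay the nodes of one file in order of increasing version."""
    content = bytearray()
    for node in sorted(nodes, key=lambda n: n.version):
        if node.delete_length:
            end = node.delete_offset + node.delete_length
            del content[node.delete_offset:end]
        if node.data:
            end = node.offset + len(node.data)
            if len(content) < end:
                content.extend(bytes(end - len(content)))  # zero-fill a gap
            content[node.offset:end] = node.data
    return bytes(content)


# Nodes as they might be found when scanning the medium, in no special order.
nodes = [
    Node(version=2, offset=6, data=b"WORLD"),
    Node(version=1, offset=0, data=b"hello world"),
    Node(version=3, delete_offset=5, delete_length=6),
]
full = rebuild_file(nodes)         # b"hello"
partial = rebuild_file(nodes[:2])  # b"hello WORLD"
```

The second call leaves out the newest node, as if its write had been interrupted: the file is still reconstructed consistently, only without the last change.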
JFFS writes to the storage medium in a cyclic manner. New nodes simply are appended until the end of the storage medium is reached. Before the end of the storage medium is reached, the first block of the storage medium must be freed for use. This is accomplished by copying all valid nodes in that block (i.e. nodes that have not been made obsolete by later nodes) to the end of the log and then erasing the block.
JFFS is robust with respect to unexpected interruptions such as power loss. If the system crashes or experiences an unexpected loss of power, only the last node written might be affected. The affected file can be recreated except for the changes described by the affected node. This robustness comes at the expense of inefficient storage and retrieval of data. The number of bytes required to store a file can be significantly greater than the actual file size.
Another drawback of JFFS is that it is incompatible with DOS-FAT-like file systems, which use separate areas of the storage medium for the DOS-FAT structures and for the data whose storage allocation is described by the DOS-FAT structures.
There is thus a widely recognized need for, and it would be highly advantageous to have, a file system that is both compatible with DOS-FAT-like file systems and robust with respect to unexpected interruptions.