A filesystem is a means for organizing data stored in a storage device as a collection of files and directories. In order to present the data as a collection of files and directories, the filesystem maintains structures of metadata. The term metadata, in the context of a filesystem, refers to information that describes files and directories but is not part of the stored data itself. For example, the following information items describe a file and are considered part of the file's metadata: file name, file size, creation time, last access/write time, user id, and block pointers that point to the actual data of the file on a storage device. Information items that constitute the metadata of a directory mainly include names of, and references to, the files and sub-directories included in the directory.
Traditional filesystems utilize two principal data structures for managing metadata. The first, which maintains file metadata, is known as an ‘inode’ (index node). The inode is a data structure that stores all the information about a regular file, directory, or other filesystem object. The inode is typically an entry in an inode table and is identified by an inode number, which is the index, in the inode table, of the entry containing the inode. The second metadata structure is the directory, which is used for mapping file names to inode numbers. Directories generally include multiple sub-structures called directory entries, each containing a tuple of a file name and an inode number.
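The relationship between the two structures can be illustrated with a minimal in-memory sketch. It is not the layout of any particular filesystem: the `Inode` fields, the `alloc_inode` and `lookup` helpers, and the choice to leave slot 0 of the table unused are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Inode:
    """Hypothetical inode: metadata of one file or directory."""
    is_dir: bool = False
    size: int = 0
    uid: int = 0
    # Block pointers locate the file's data on the storage device.
    block_pointers: List[int] = field(default_factory=list)
    # Used only for directories: directory entries (file name -> inode number).
    entries: Dict[str, int] = field(default_factory=dict)


# The inode table; an inode number is the index of an entry in this list.
# Slot 0 is left unused in this sketch (an assumption, not a requirement).
inode_table: List[Optional[Inode]] = [None]


def alloc_inode(inode: Inode) -> int:
    """Store the inode in the table and return its inode number."""
    inode_table.append(inode)
    return len(inode_table) - 1


def lookup(dir_ino: int, name: str) -> Inode:
    """Resolve a name inside a directory: name -> inode number -> inode."""
    ino = inode_table[dir_ino].entries[name]
    return inode_table[ino]


root = alloc_inode(Inode(is_dir=True))
notes = alloc_inode(Inode(size=5, uid=1000, block_pointers=[42]))
inode_table[root].entries["notes.txt"] = notes  # the directory entry
```

Resolving `lookup(root, "notes.txt")` first consults the directory entry to obtain the inode number and then indexes the inode table, which mirrors the two-step mapping described above.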
A hard link is a directory entry that associates a file name with a file on a filesystem. Creating a hard link has the effect of creating multiple names for the same file. One inode can be pointed to by multiple directory entries, including one directory entry for the original file and one or more directory entries for one or more hard links, wherein each directory entry includes a different file name but the same inode number. Each directory entry may reside under a different directory.
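This behavior can be observed with the standard Python `os.link` call. The sketch below assumes that the temporary directory resides on a filesystem that supports hard links; the function name `hard_link_demo` and the file names are hypothetical.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file and a hard link to it in two different directories."""
    with tempfile.TemporaryDirectory() as top:
        dir_a = os.path.join(top, "a")
        dir_b = os.path.join(top, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)

        original = os.path.join(dir_a, "report.txt")
        alias = os.path.join(dir_b, "report-link.txt")
        with open(original, "w") as f:
            f.write("hello")

        # Add a second directory entry that refers to the same inode.
        os.link(original, alias)

        same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
        link_count = os.stat(original).st_nlink

        # Removing one name leaves the file reachable through the other.
        os.unlink(original)
        with open(alias) as f:
            content = f.read()
        return same_inode, link_count, content
```

Both names report the same inode number, and the link count stored in the inode reflects the two directory entries. The data remains readable through the second name after the first is removed.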
Filesystem integrity in case of a crash is an issue that filesystem designers have had to deal with for many years. Early filesystems did not address this issue properly and relied on running an integrity checking program periodically, usually during boot. Modern filesystems address this issue in various ways.
Consistency problems related to metadata structures may occur when an inode is not pointed to by any directory entry, or when stale directory entries point to free inodes or, worse, to the wrong inodes, which creates a security hazard and may cause unexpected problems.
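The first two kinds of inconsistency can be detected mechanically, as the following sketch illustrates. The function `check_metadata`, its arguments, and the assumption that the root directory has inode number 1 are hypothetical. A directory entry that points to a valid but wrong inode cannot be detected by this kind of check, which is what makes that case the most hazardous.

```python
from typing import Dict, List, Set, Tuple


def check_metadata(
    allocated: Set[int],
    directories: Dict[int, Dict[str, int]],
    root_ino: int = 1,
) -> Tuple[List[int], List[Tuple[int, str, int]]]:
    """Return (orphan inodes, dangling directory entries).

    allocated:   inode numbers currently marked as in use.
    directories: directory inode number -> {file name: inode number}.
    """
    referenced = set()
    dangling = []
    for dir_ino, entries in directories.items():
        for name, ino in entries.items():
            if ino in allocated:
                referenced.add(ino)
            else:
                # Stale entry: it points to a free (unallocated) inode.
                dangling.append((dir_ino, name, ino))

    # Allocated inodes that no directory entry points to. The root
    # directory is exempt because nothing is expected to point to it.
    orphans = sorted(allocated - referenced - {root_ino})
    return orphans, sorted(dangling)
```

For example, with allocated inodes `{1, 2, 3, 4}` and a root directory containing the entries `a -> 2` and `b -> 9`, the function reports inodes 3 and 4 as orphans and the entry `b` as dangling.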
The following approaches have been used for maintaining integrity of metadata:
1. Most filesystems use a journal to guarantee atomicity when performing file operations that involve changes to more than one metadata structure. The journal records the transaction to be executed before the inode and the directory entry are committed to disk. After a crash, incomplete transactions recorded in the journal may be replayed or rolled back. Journals have the disadvantage of increasing the latency of filesystem operations, since additional I/O operations are required for generating the transaction and writing it to the journal file.
2. An alternative to journaling relies on specialized hardware, such as NVRAM (Non-Volatile RAM) or a system with an uninterruptible power supply (UPS), to guarantee that changes are not lost after a crash or a power failure.
3. Soft updates are another alternative to journaling. This technique was invented by Marshall Kirk McKusick and Gregory R. Ganger (“Soft Updates: A Technique for Eliminating Most Synchronous Writes in the Fast Filesystem”, Proceedings of the FREENIX Track: 1999 USENIX Annual Technical Conference, Monterey, Calif., USA, Jun. 6-11, 1999) and was implemented as part of the FFS filesystem on 4.4BSD. The technique orders I/O operations so as to ensure that there are never references to invalid data (such as a directory entry that points to a wrong or missing inode). One of its ordering constraints is that a structure is never pointed to before it has been initialized (e.g., an inode must be initialized before a directory entry points to it). According to this publication, soft updates have better performance than journaling.
4. Log-structured filesystems implement the filesystem as a log and eliminate the need to write to an external log as well as to the directory and inodes. Log-structured filesystems were invented by John Ousterhout as part of the experimental Sprite operating system in the mid-1980s. A log-structured filesystem may make reads much slower, since it fragments files that conventional filesystems normally keep contiguous with in-place writes.
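The journaling approach of item 1 can be sketched as a small in-memory model. This is a simplified redo-only journal, not the format used by any real filesystem: the class `JournaledFS`, its record layout, and the `crash_after` parameter (which simulates a crash at a chosen point) are assumptions made for illustration.

```python
class JournaledFS:
    """Toy model in which all three structures stand for on-disk state."""

    def __init__(self):
        self.inodes = {}     # inode number -> metadata
        self.directory = {}  # file name -> inode number
        self.journal = []    # journal records, in write order

    def create(self, name, ino, crash_after=None):
        # 1. Record the transaction and its commit mark in the journal
        #    before touching either metadata structure.
        self.journal.append(("begin", name, ino))
        if crash_after == "begin":
            return
        self.journal.append(("commit", name, ino))
        if crash_after == "commit":
            return
        # 2. Apply the changes to the two metadata structures.
        self.inodes[ino] = {"size": 0}
        if crash_after == "inode":
            return  # inode written, directory entry still missing
        self.directory[name] = ino
        # 3. The transaction is fully applied; its records can be dropped.
        self.journal.clear()

    def recover(self):
        """Replay committed transactions and discard uncommitted ones."""
        for kind, name, ino in self.journal:
            if kind == "commit":
                self.inodes.setdefault(ino, {"size": 0})
                self.directory[name] = ino
        # Uncommitted transactions never modified the structures in this
        # model, so discarding their records is sufficient.
        self.journal.clear()
```

A crash after the inode is written but before the directory entry is added leaves an orphan inode. Running `recover()` replays the committed record and restores the missing directory entry, whereas a transaction that crashed before its commit record was written leaves no trace.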
Some filesystems are programmed to run consistency checks in order to restore the consistency of the filesystem after a system failure. Typically, after the system restarts and before the filesystem is mounted for read and write access, the filesystem executes a complete walk through its data structures. For example, Linux and UNIX systems use the fsck (file system consistency check) command to check filesystem consistency and repair it, while the equivalent Microsoft Windows commands are CHKDSK and SCANDISK. This process is time- and resource-consuming.