A filesystem is a means for organizing data that is stored in a storage device, as a collection of files and directories. In order to present the data as a collection of files and directories, the filesystem maintains structures of metadata. The term metadata, in the context of a filesystem, refers to information that describes volumes, files and directories, but this information is not part of the stored data itself. For example, the following information items describe a file and are considered as part of the file's metadata: a file name, file size, creation time, last access/write time, user id, and block pointers that point to the actual data of the file on a storage device. Information items that compose metadata of a directory mainly include names and references to files and sub-directories included in the directory.
Traditional filesystems utilize two principal data structures for managing metadata. One data structure is for maintaining file metadata (also known as ‘inode’ in Unix-style file systems) while the second data structure is a directory, which is used for storing and maintaining directory content.
The inode is a data structure that stores all the information about a regular file, directory, or other file system object. The inode is typically part of an inode table and is identified by an inode number, which is an index of an entry containing the inode, in the inode table.
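The relationship between an inode, the inode table and the inode number can be illustrated with a minimal sketch. The class names, the field selection and the first-free-slot allocation policy below are assumptions chosen for illustration only and do not reflect the on-disk layout of any particular filesystem.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Inode:
    """Hypothetical subset of the per-file metadata described above."""
    size: int = 0
    uid: int = 0
    mtime: float = 0.0
    block_pointers: List[int] = field(default_factory=list)


class InodeTable:
    """Fixed-size table; the inode number is simply the index of the entry."""

    def __init__(self, capacity: int) -> None:
        self.entries: List[Optional[Inode]] = [None] * capacity

    def allocate(self, inode: Inode) -> int:
        # Assumed policy: take the first free slot and return its index.
        for number, entry in enumerate(self.entries):
            if entry is None:
                self.entries[number] = inode
                return number
        raise OSError("inode table is full")

    def lookup(self, number: int) -> Inode:
        # Reject numbers outside the table as well as unused slots.
        if not 0 <= number < len(self.entries) or self.entries[number] is None:
            raise FileNotFoundError(f"inode {number} is not in use")
        return self.entries[number]
```

In this sketch a file name never appears in the inode; the name is held by the directory that refers to the inode number.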
Most filesystems use an inode table, which is either contiguous or scattered among different allocation groups (sub-volumes) for improving performance. Other methods of storing and managing inodes, such as using B-trees or prepending the inodes to the file data, have occasionally been used as well. Some filesystems (such as NTFS and VxFS, the Veritas File System) use internal files to store the inodes.
Traditional Unix and Linux filesystems, such as NFS (Network File System), use inodes and directories that are managed as separate entities. Windows' NTFS (New Technology File System) uses a table named MFT (Master File Table), whose entries are equivalent to Unix inodes. Most filesystems do not store a file's metadata in directory entries, so that information must be obtained from the inodes.
Directories are implemented as files that contain tuples of file names and inode numbers. Some filesystems (e.g. ext2 through ext4) include additional information in the directory entry, such as the file type. Most filesystems store each directory in a separate block, preferably close to the data of the directory. Traversing a whole directory tree (as is done for backups, copying, virus scans, etc.) is a time-consuming operation, because the directories are scattered all over the volume.
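As an illustration of directories holding tuples of file names and inode numbers, the following sketch resolves a path by looking up one name per directory. The in-memory mapping, the sample names, the root inode number and the function name are hypothetical and serve only to show the lookup chain.

```python
from typing import Dict

ROOT_INODE = 2  # assumed inode number of the root directory

# Hypothetical volume: inode number of a directory -> its (name -> inode number) entries.
DIRECTORIES: Dict[int, Dict[str, int]] = {
    2: {"home": 11, "etc": 12},
    11: {"alice": 21},
    12: {},
    21: {"notes.txt": 30},
}


def resolve(path: str, directories: Dict[int, Dict[str, int]] = DIRECTORIES) -> int:
    """Return the inode number that an absolute path refers to."""
    current = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:
            continue  # tolerate "/" and repeated slashes
        if current not in directories:
            raise NotADirectoryError(path)
        entries = directories[current]
        if name not in entries:
            raise FileNotFoundError(path)
        current = entries[name]
    return current
```

Each path component costs one directory read in this sketch, which is why a full tree traversal touches every directory block on the volume.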
Generally, a directory contains 10-20 files and sub-directories, so the block allocated to the directory is mostly empty. On the other hand, a large directory may span multiple blocks that are not necessarily contiguous; scanning and lookups are therefore much slower in large directories, because the non-contiguous blocks are read non-sequentially.
The Btrfs filesystem (B-tree File System, a GPL-licensed file system for Linux) uses B-trees to manage the filesystem. Btrfs stores inodes together with the file data and may store small files directly in the B-tree. Directory indexes are kept in a global B-tree, but the directory data (the tuples of file name and inode number) is actually stored in individual directory objects.
Hierarchical File System (HFS) is a file system developed by Apple Inc. HFS manages a data structure called the Catalog File, which is a B-tree that contains records for all the files and directories stored in a volume. There are four types of records in the Catalog File: a File Thread Record, a File Record, a Directory Thread Record and a Directory Record. Files and directories in the Catalog File are located by a unique Catalog Node ID (CNID). A File or Directory Thread Record stores just the name of the file or directory and the CNID of its parent directory. A File Record stores metadata about the file, including its CNID, the size of the file, timestamps, block extents of the data, etc.
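Because a thread record holds a name and the CNID of the parent directory, the full path of an object can be rebuilt by following parent CNIDs upward. The sketch below shows this idea only; the record type, the dictionary standing in for the Catalog File, the sample CNIDs and names, and the sentinel parent value are all assumptions and not the actual HFS record format.

```python
from typing import Dict, List, NamedTuple


class ThreadRecord(NamedTuple):
    """Hypothetical in-memory form of a file or directory thread record."""
    name: str
    parent_cnid: int


ROOT_PARENT = 1  # assumed sentinel recorded as the parent of the top-level directory


def path_components(cnid: int, threads: Dict[int, ThreadRecord]) -> List[str]:
    """Walk thread records from an object up to the top-level directory."""
    parts: List[str] = []
    while cnid != ROOT_PARENT:
        record = threads[cnid]  # raises KeyError if no thread record exists
        parts.append(record.name)
        cnid = record.parent_cnid
    parts.reverse()
    return parts
```

A real catalog would locate each thread record through a B-tree search rather than a dictionary lookup, but the parent-following loop would be the same.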
The filesystem is associated with a volume that has been initialized for hosting the filesystem. The volume is a collection of blocks on one or more storage devices (e.g. disks). The volume may consist of all of the blocks on a single storage device or of the blocks of a partition, which is a portion of the storage device, or it may even span multiple storage devices. The files' metadata is generally stored in a dedicated area of the same volume that stores the files and directories of the filesystem, or it may be stored as a special file within the volume.
A B-tree or a B+tree, a type of tree that is commonly used by filesystems for various purposes, represents sorted data in a way that allows for efficient insertion, retrieval and removal of records, wherein each record is identified by a key. An internal node (non-leaf node) includes multiple keys and a leaf node includes data. The number of keys in a node is at least n and at most 2n. If a node already has 2n keys, then adding a key to that node can be accomplished by splitting the node into two nodes of n keys each and moving a separating key (typically the median) up to the parent node. Each of the split nodes then has the required minimum number of keys, n. If an internal node and its neighbor (an adjacent node at the same level) each have n keys, and a key is deleted from the internal node, then the internal node is combined with its neighbor.
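The split of a full node can be sketched as follows, assuming the common textbook formulation in which the 2n existing keys plus the new key are ordered and the median of the resulting 2n+1 keys is the one that moves to the parent. The function name and the use of plain integer keys are illustrative assumptions; child pointers and the recursive handling of a full parent are omitted.

```python
import bisect
from typing import List, Tuple


def split_on_insert(keys: List[int], new_key: int, n: int) -> Tuple[List[int], int, List[int]]:
    """Split a full node (2n sorted keys) while inserting new_key.

    Returns (left_keys, promoted_key, right_keys), where both halves
    hold exactly n keys and promoted_key belongs in the parent node.
    """
    if len(keys) != 2 * n:
        raise ValueError("node is not full; a plain insertion is sufficient")
    merged = list(keys)              # leave the caller's list untouched
    bisect.insort(merged, new_key)   # 2n + 1 keys, still sorted
    return merged[:n], merged[n], merged[n + 1:]
```

For example, with n = 2, inserting 25 into a node holding 10, 20, 30 and 40 yields the halves [10, 20] and [30, 40], with 25 promoted to the parent.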
A file handle is a reference that the filesystem assigns to a file when it is opened. The filesystem uses the file handle for locating the metadata of the file when accessing the file, and the handle is used throughout the session of accessing the file. File handles are typically implemented as a tuple that is composed of three components: (i) file system ID (FSID); (ii) inode number; and (iii) generation number. The FSID is used to select the filesystem, which identifies a partition or volume. The generation number is used to invalidate the handle in case the inode is deleted and recycled. When a deleted inode is reassigned to another file, its generation number is changed. If the old file is still open on a host, the host may try to access it using the number of the reassigned inode together with the old generation number, and the mismatch allows the filesystem to reject the stale handle.
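The role of the generation number can be illustrated with a short sketch. The class names, the in-memory generation map and the policy of incrementing the generation on every assignment are assumptions made for the example, not the behavior of a specific filesystem.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FileHandle:
    fsid: int
    inode_number: int
    generation: int


class Volume:
    """Hypothetical volume that tracks the current generation of each inode."""

    def __init__(self, fsid: int) -> None:
        self.fsid = fsid
        self.generations: Dict[int, int] = {}  # inode number -> current generation

    def assign_inode(self, inode_number: int) -> FileHandle:
        # Assigning (or reassigning) an inode to a file bumps its generation.
        generation = self.generations.get(inode_number, 0) + 1
        self.generations[inode_number] = generation
        return FileHandle(self.fsid, inode_number, generation)

    def is_valid(self, handle: FileHandle) -> bool:
        # A handle is stale if it names another volume or an older generation.
        return (
            handle.fsid == self.fsid
            and self.generations.get(handle.inode_number) == handle.generation
        )
```

In this sketch, a handle obtained before the inode was recycled fails `is_valid` afterwards, while the handle of the new file that reuses the same inode number passes.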
An Access Control List (ACL) is a filesystem object that defines file access rights and contains entries that specify individual user or group rights to specific files. An ACL specifies which users are granted access to a file, as well as what operations are allowed. Each entry in a typical ACL specifies one or more users and an operation that those users are permitted to perform on the file. There are several approaches for storing ACLs: an ACL can be stored in the inode of the file, in a separate block pointed to by the inode, or in a separate inode, or all the ACLs can be stored in a single file, as in Microsoft's NTFS.
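An ACL check of the kind described above can be sketched as a scan over the entries. The entry layout, the operation names and the simplification that user and group names share one namespace are assumptions of this example; real ACL models often add deny entries and ordering rules that are not shown here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class AclEntry:
    principal: str              # a user name or a group name (assumed shared namespace)
    operations: FrozenSet[str]  # e.g. {"read", "write"}


def is_permitted(acl: List[AclEntry], user: str, groups: Iterable[str], operation: str) -> bool:
    """Allow the operation if any entry for the user or one of its groups grants it."""
    principals = {user} | set(groups)
    return any(
        entry.principal in principals and operation in entry.operations
        for entry in acl
    )
```

With an ACL that grants "read" and "write" to a user and only "read" to a group, a member of that group who is not the named user would be allowed to read but not to write.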