The space in a typical file system, such as the second extended (ext2) file system, is split into blocks, which are organized into block groups, analogous to cylinder groups in the Unix File System. Each block group contains a copy of the superblock, a block bitmap, an inode bitmap and an inode table, followed by the actual data blocks. The superblock contains information that is crucial to the operation of the file system, so backup copies of it are kept in every block group. A group descriptor stores the location of the block bitmap, the location of the inode bitmap and the start of the inode table for its block group, and the group descriptors are, in turn, stored in a group descriptor table.
When a file system is created, data structures that contain information about files are created. Each file has an inode and is identified by an inode number in the file system where it resides. An inode is a data structure, used by file systems on Linux and other Unix-like operating systems, that stores all the information about a file except its name and its actual data. FIG. 1 shows an example of an ext2 inode architecture and FIG. 2 shows an example of an inode data structure.
As shown in FIG. 2, an inode data structure includes an i_block array that stores entries, or links, pointing to the corresponding data blocks shown in FIG. 1. The first 12 entries in this array point directly at data blocks of the file. The next three entries point to blocks that contain block pointers. The first of these points to the “indirect block”, which contains pointers to the next several blocks of the file. The second points to a block of pointers to further blocks, each of which contains pointers to the next several blocks of the file (double indirection). The final entry points to a block that contains pointers to blocks that contain pointers to blocks that contain pointers to blocks of data (triple indirection).
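The lookup described above can be sketched as follows. This is an illustrative sketch only, not code from any actual ext2 implementation: the function name locate_block is hypothetical, and the 1 KiB block size and 4-byte block pointers are assumptions chosen for the example.

```python
DIRECT_SLOTS = 12   # i_block[0..11] point straight at data blocks
POINTER_SIZE = 4    # assumption: block numbers are stored as 32-bit values


def locate_block(n, block_size=1024):
    """Return the chain of indices followed to reach logical block n of a file.

    The first element is an index into i_block. Each later element is an
    index into an indirect block that has to be read from disk first.
    """
    if n < 0:
        raise ValueError("block index must be non-negative")
    ptrs = block_size // POINTER_SIZE  # pointers held by one indirect block

    if n < DIRECT_SLOTS:               # direct entries
        return [n]
    n -= DIRECT_SLOTS

    if n < ptrs:                       # entry 12: single indirection
        return [12, n]
    n -= ptrs

    if n < ptrs ** 2:                  # entry 13: double indirection
        return [13, n // ptrs, n % ptrs]
    n -= ptrs ** 2

    if n < ptrs ** 3:                  # entry 14: triple indirection
        return [14, n // ptrs ** 2, (n // ptrs) % ptrs, n % ptrs]

    raise ValueError("block index exceeds what triple indirection can address")
```

The length of the returned list is the number of pointers that must be followed, so a block reached through entry 14 costs three indirect-block reads before the data block itself can be read.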
Typically, a file system such as ext2 allocates space by block group, and does not enforce any relationship between block allocations (although it does try to allocate all of the blocks for a particular file within the same block group as the file's inode).
Reading a very large file may require multiple reads just to find out where the data for the file is stored. Because nothing constrains the file system to allocate these blocks in any particular relationship to one another, they may become scattered all over the disk. The default Linux file system (ext2) uses block groups to keep the contents of a file together, and tries to allocate the data blocks for a file within the same block group as its inode (the map that the file system uses to find the data blocks for the file), but this is not always successful.
In addition, the standard practice for UNIX-type file systems is to store almost all of the information about a file in an inode data structure. This data structure contains, among other things, the file's owner and permissions information, size, type, update and access times, and the start of a map of the data blocks that hold the data for the file, as well as pointers to the remainder of that map. The collection of inodes is stored as a fixed-size linear array near the beginning of the file system. This makes inode operations very fast and robust, but it does introduce a few inefficiencies.
First, all inodes are the same size, and are optimized for small files. Very small files (less than 10 KB) waste space in the i_block array, since they have so few blocks. Very large files (larger than 64 MB) require a three-level lookup to find all of their data blocks. The blocks used to perform this lookup have no enforced location in relation to the actual file data blocks or to the inode table itself, so just finding a single block near the end of a large file may require reading four blocks from all over the file system. Because each of these blocks contains the pointer to the next, they have to be read in sequence, and the read operation cannot be parallelized across a redundant disk array.
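The size thresholds mentioned above follow from simple arithmetic. The sketch below assumes 1 KiB blocks and 4-byte block pointers, values that are consistent with the roughly 64 MB figure but are not stated in the text; the function names are hypothetical.

```python
def indirection_limits(block_size=1024, pointer_size=4, direct_slots=12):
    """Return the largest file size (in bytes) reachable at each lookup level."""
    ptrs = block_size // pointer_size            # pointers per indirect block
    direct = direct_slots * block_size           # entries 0..11
    single = direct + ptrs * block_size          # plus entry 12
    double = single + ptrs ** 2 * block_size     # plus entry 13
    triple = double + ptrs ** 3 * block_size     # plus entry 14
    return {"direct": direct, "single": single,
            "double": double, "triple": triple}


def reads_for_offset(offset, block_size=1024):
    """Count the block reads needed to fetch the data at a byte offset.

    The count includes the data block itself and every indirect block on
    the way to it; it does not include reading the inode.
    """
    limits = indirection_limits(block_size)
    for indirect_reads, level in enumerate(("direct", "single", "double", "triple")):
        if offset < limits[level]:
            return indirect_reads + 1
    raise ValueError("offset is beyond the maximum addressable file size")
```

Under these assumptions the direct entries cover 12 KiB, and the double-indirect range ends at 67,383,296 bytes (about 64 MiB); any offset past that point needs four dependent reads, matching the four blocks described above.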
Second, the number of inodes is fixed at the time the file system is created. There are tools that let a user add inodes to an existing file system, but they require a manual process. A user cannot remove excess inodes from a file system without rebuilding the file system from scratch. The file system therefore winds up either with too many inodes, which wastes space, or with not enough, which makes it impossible to create new files even if there are unallocated blocks on the file system.
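The inode-exhaustion condition can be illustrated with a short sketch. The helper names below are hypothetical. os.statvfs is a standard call available on Unix-like systems, and its f_files, f_ffree and f_bfree fields report total inodes, free inodes and free blocks respectively, although some file systems report zero for the inode counts.

```python
import os


def creation_blocked_by_inodes(free_inodes, free_blocks):
    """True when free blocks remain but no inode is left for a new file."""
    return free_inodes == 0 and free_blocks > 0


def inode_usage(path="/"):
    """Read inode and block counters for the file system holding path."""
    st = os.statvfs(path)  # Unix-only
    return {
        "total_inodes": st.f_files,
        "free_inodes": st.f_ffree,
        "free_blocks": st.f_bfree,
    }
```

For example, passing the free_inodes and free_blocks values from inode_usage to creation_blocked_by_inodes indicates whether a file system is in the state described above, where space remains but no new file can be created.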