In data processing systems, data used by application programs and operating systems is stored in data files on storage devices such as magnetic disk drives. Files are generally treated as linear arrays of data elements, typically bytes. File operations such as reading and writing employ a pointer and a length value to identify a starting location and amount of data to be read from or written to the file.
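The pointer-and-length convention described above can be illustrated with a minimal sketch. The helper names `read_range`, `write_range`, and `demo` are hypothetical and are introduced here only for illustration; the sketch assumes an ordinary byte-addressable file accessed through Python's standard file objects.

```python
import os
import tempfile


def read_range(path, offset, length):
    """Read up to `length` bytes starting at byte `offset` of the file."""
    with open(path, "rb") as f:
        f.seek(offset)          # the "pointer": starting location in the linear file
        return f.read(length)   # the "length": amount of data to transfer


def write_range(path, offset, data):
    """Overwrite bytes starting at byte `offset`; returns the number of bytes written."""
    with open(path, "r+b") as f:
        f.seek(offset)
        return f.write(data)


def demo():
    """Create a 10-byte scratch file, overwrite bytes 3..5, then read bytes 2..6."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"0123456789")
        write_range(path, 3, b"abc")
        return read_range(path, 2, 5)
    finally:
        os.remove(path)
```

In this sketch the caller never sees where the bytes reside on the storage device; it addresses the file purely as a linear array.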
When data is written to a file, the file system allocates storage space on the storage device to hold the data being written. The storage space of the storage device is divided into uniform allocation units, and allocating storage space involves assigning one or more allocation units to store the data of corresponding sections of the linear file. A very small file may fit entirely within a single allocation unit, but more generally files require multiple allocation units. The file system maintains a table that maps the linear array of the file to the respective allocation units that store the file data.
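As a rough illustration of such a mapping, the sketch below models the table as a simple list whose i-th entry is the device allocation unit holding the i-th section of the file. The 4096-byte unit size, the list representation, and the names `units_needed` and `locate` are assumptions made for illustration; real file systems use more elaborate structures.

```python
ALLOCATION_UNIT_SIZE = 4096  # bytes; an illustrative value, not taken from the text


def units_needed(file_size, unit_size=ALLOCATION_UNIT_SIZE):
    """Number of allocation units required to hold `file_size` bytes (ceiling division)."""
    return (file_size + unit_size - 1) // unit_size


def locate(offset, allocation_table, unit_size=ALLOCATION_UNIT_SIZE):
    """Map a byte offset in the linear file to (device allocation unit, offset within unit)."""
    index = offset // unit_size              # which section of the file
    if index >= len(allocation_table):
        raise ValueError("offset lies beyond the space allocated to the file")
    return allocation_table[index], offset % unit_size


# A hypothetical three-unit file stored in device units 17, 18 and 42.
table = [17, 18, 42]
```

With this table, byte 5000 of the file falls in the second section, so `locate(5000, table)` yields device unit 18 at offset 904 within that unit.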
It is generally desirable that files be stored in sets of contiguous or successive allocation units where possible. Disk drives are relatively slow at providing random access to allocation units, whereas a set of contiguous allocation units can be transferred at very high speed after an initial delay associated with positioning a mechanical transducer. When a file is stored in two or more sets of allocation units that are not contiguous with each other, the file is said to be “fragmented”. Excessive fragmentation can reduce file I/O performance by requiring more of the slow positioning operations for each read or write.
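The effect of fragmentation can be sketched with a toy cost model. The code below groups a file's allocation units into contiguous runs and charges one positioning delay per run plus a fixed transfer time per unit. The function names and the timing figures (10,000 microseconds per positioning operation, 100 microseconds per unit transferred) are hypothetical values chosen only to show the shape of the trade-off, not measurements of any device.

```python
def contiguous_runs(allocation_table):
    """Group device unit numbers into (first_unit, run_length) runs of successive units."""
    runs = []
    for unit in allocation_table:
        if runs and unit == runs[-1][0] + runs[-1][1]:
            # This unit directly follows the current run, so extend it.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            # A gap: start a new run (a new fragment).
            runs.append((unit, 1))
    return runs


def estimated_read_time_us(allocation_table, position_us=10_000, transfer_us=100):
    """Toy model: one positioning delay per run, plus transfer time per unit."""
    runs = contiguous_runs(allocation_table)
    return len(runs) * position_us + len(allocation_table) * transfer_us


fragmented = [17, 18, 42, 43, 44, 7]     # three separate runs
contiguous = [17, 18, 19, 20, 21, 22]    # one run of the same length
```

Under these assumed figures the fragmented layout costs 30,600 microseconds to read in full, against 10,600 microseconds for the contiguous layout, and the whole difference comes from the two extra positioning operations.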
There are known techniques that address the issue of file fragmentation. For example, there are software tools that can be used to assess the level of fragmentation of a disk drive and to reallocate storage space among the data files so as to reduce the level of fragmentation, a process referred to as “defragmenting”. However, defragmentation is a resource-intensive process that can be lengthy and may adversely affect the performance of programs accessing the data. There are also steps that can be taken to reduce the tendency toward fragmentation in the first place. Some file systems provide for user control over the size of the storage allocation unit used by the file system, for example, in order to achieve a desired balance between efficient use of storage resources (promoted by smaller allocation units) and low fragmentation (generally promoted by larger allocation units). Some file systems also provide for allocation in increments larger than a single allocation unit in some circumstances, for example in connection with compressed files.
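The balance between storage efficiency and fragmentation can be made concrete with a small calculation. The sketch below computes, for a set of file sizes and a chosen allocation unit size, the space left unused at the end of each file's last unit and the total number of units allocated, which bounds how many separate fragments the files could ever be split into. The function names and the sample file sizes are hypothetical and serve only as an illustration.

```python
def slack_bytes(file_sizes, unit_size):
    """Total bytes allocated but unused in the last allocation unit of each file."""
    return sum((-size) % unit_size for size in file_sizes)


def total_units(file_sizes, unit_size):
    """Total allocation units used; an upper bound on the number of fragments."""
    return sum((size + unit_size - 1) // unit_size for size in file_sizes)


sizes = [100, 5000, 70000]   # illustrative file sizes in bytes

small = (slack_bytes(sizes, 512), total_units(sizes, 512))
large = (slack_bytes(sizes, 65536), total_units(sizes, 65536))
```

For these sample sizes, 512-byte units waste 676 bytes but spread the files over 148 units, whereas 65,536-byte units need only 4 units at the cost of 187,044 wasted bytes.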