Many contemporary data processing systems consume and/or produce vast quantities of data. Electromechanical devices such as hard disk drives are often used to store this data during processing or for later review. The mechanical nature of many types of mass storage devices limits their speed to a fraction of the system's potential processing speed, so measures must be taken to ameliorate the effects of slow storage.
Mass storage devices are commonly viewed as providing a series of addressable locations in which data can be stored. Some devices (such as tape drives) permit storage locations to be accessed in sequential order, while other devices (such as hard disks) permit random access. Each addressable storage location can usually hold several data bytes; such a location is called a “block.” Common block sizes are 512 bytes, 1024 bytes and 4096 bytes, though other sizes may also be encountered. A “mass storage device” may be constructed from a number of individual devices operated together to give the impression of a single device with certain desirable characteristics. For example, a Redundant Array of Independent Disks (“RAID array”) may contain two or more hard disks with data spread among them to obtain increased transfer speed, improved fault tolerance or simply increased storage capacity. The placement of data (and calculation and storage of error detection and correction information) on various devices in a RAID array may be managed by hardware and/or software.
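The block addressing described above can be illustrated with a short sketch. It assumes a device whose blocks are numbered from zero and shows which blocks a byte range touches. The function name `blocks_for_range` and the 4096-byte default block size are illustrative assumptions, not part of any particular device interface.

```python
def blocks_for_range(offset, length, block_size=4096):
    """Return the block numbers touched by a byte range.

    Hypothetical helper: assumes blocks are numbered from 0 and that
    byte `offset` lives in block `offset // block_size`.
    """
    if offset < 0 or block_size <= 0:
        raise ValueError("offset must be >= 0 and block_size must be > 0")
    if length <= 0:
        return range(0)  # an empty request touches no blocks
    first = offset // block_size
    last = (offset + length - 1) // block_size  # block holding the final byte
    return range(first, last + 1)


# A 200-byte request starting near the end of block 0 spills into block 1.
print(list(blocks_for_range(4000, 200)))        # [0, 1]
# With 512-byte blocks, bytes 1024..2047 occupy blocks 2 and 3.
print(list(blocks_for_range(1024, 1024, 512)))  # [2, 3]
```

Even a small request may therefore require more than one block to be transferred when it straddles a block boundary.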
Occasionally, the entire capacity of a storage device is dedicated to holding a single data object, but more often a set of interrelated data structures called a “filesystem” is used to divide the storage available among a plurality of data files. Filesystems usually provide a hierarchical directory structure to organize the files on the storage device. Note that a file in a filesystem is basically a sequence of stored bytes, so it can be treated identically to a mass storage device for many purposes. For example, a second filesystem can be created in a file on a first filesystem. The second filesystem can be used to divide the storage space of the file among a plurality of data files, all of which reside within the file on the first filesystem. Such nested filesystems can be constructed to an arbitrary depth, although depths exceeding one or two levels are not particularly useful. A file that contains a nested filesystem is called a “container file.”
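The observation that a file can be treated like a mass storage device may be sketched as follows. The class below is a hypothetical illustration, not an actual filesystem implementation: it exposes block-oriented reads and writes on top of any seekable file-like object, which is the role a container file plays for a nested filesystem. An in-memory `io.BytesIO` object stands in for the container file.

```python
import io


class FileBackedBlockDevice:
    """Hypothetical block-device view of a seekable, binary file-like object."""

    def __init__(self, backing, block_size=512):
        self.backing = backing
        self.block_size = block_size

    def write_block(self, n, data):
        # Whole blocks only, as with a real block device.
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.backing.seek(n * self.block_size)
        self.backing.write(data)

    def read_block(self, n):
        self.backing.seek(n * self.block_size)
        data = self.backing.read(self.block_size)
        # Blocks beyond the end of the backing file read as zeros
        # (an assumption of this sketch).
        return data.ljust(self.block_size, b"\0")


container = io.BytesIO()  # stands in for a container file
device = FileBackedBlockDevice(container, block_size=512)
device.write_block(2, b"a" * 512)
print(device.read_block(2) == b"a" * 512)  # True
```

A nested filesystem would place its own data structures in the blocks of such a device, in the same way that the first filesystem uses the blocks of the underlying hardware.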
The logic and procedures used to maintain a filesystem (including its files and directories) within storage provided by an underlying mass storage device or container file can have a profound effect on data storage operation speed. This, in turn, can affect the speed of processing operations that read and write data in files. Thus, filesystem optimizations can improve overall system performance.
Read reallocation is a technique that can improve a storage system's performance on large sequential reads. When a read request calls for many data blocks to be copied from a mass storage device into system memory, the read may proceed faster if the data blocks are located physically near one another and/or in sequential order on the storage device. Prior-art systems recognize this benefit under the rubric of file defragmentation. FIG. 2A shows how blocks 210-250 may be arranged on a storage device 200. Blocks labeled 210 are unused, while blocks 220, 230, 240 and 250 contain data in a file. When the data blocks of a file are separated and/or stored out of order, as shown in FIG. 2A, the file is said to be “fragmented.” A process that reads the file might cause the storage system to perform four separate read operations to obtain the contents of data blocks 220-250. However, if the file is defragmented by moving the contents of data blocks 220-250 into sequential locations as shown in FIG. 2B, all the data blocks might be obtained in a single read operation. Even partial defragmentation, shown in FIG. 2C, may provide some benefit. Unfortunately, file defragmentation is a time-consuming process, as blocks must be located, read into memory, and then stored in more nearly sequential locations. If the storage device has little free capacity, it may be necessary to move many blocks from place to place to coalesce free areas. Furthermore, files that change or grow tend to become increasingly fragmented over time, necessitating repeated defragmentation operations.
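The effect of fragmentation on the number of read operations can be illustrated with a simplified model. The sketch below assumes that each run of consecutive, ascending device addresses can be fetched in one read operation, and that any gap or out-of-order block starts a new one. The function name and the device addresses are hypothetical and are not taken from the figures.

```python
def count_sequential_reads(block_addresses):
    """Count read operations needed to fetch a file's blocks in file order.

    Simplified model (an assumption of this sketch): one read operation
    covers one run of consecutive, ascending device addresses.
    """
    if not block_addresses:
        return 0
    reads = 1
    for prev, cur in zip(block_addresses, block_addresses[1:]):
        if cur != prev + 1:  # gap or out-of-order block: start a new read
            reads += 1
    return reads


# Device addresses of a four-block file, listed in file order.
fragmented = [7, 2, 12, 4]      # separated and out of order
partially_defragmented = [3, 4, 9, 10]
defragmented = [3, 4, 5, 6]

print(count_sequential_reads(fragmented))              # 4
print(count_sequential_reads(partially_defragmented))  # 2
print(count_sequential_reads(defragmented))            # 1
```

Under this model, partial defragmentation halves the number of read operations for the example file, and full defragmentation reduces it to one.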
Techniques that reduce fragmentation without explicit, time-consuming defragmentation cycles may be useful in improving storage operations.