For a variety of reasons, persistent storage devices (such as hard disk drives) are designed to store data in “allocation units,” each of which represents the smallest block of storage that can be set aside to store a particular data chunk. The size of an allocation unit varies, often depending on both hardware and software configuration. Because storage is allocated in whole units, even if a given data chunk is smaller than the allocation unit, an entire allocation unit is allocated to hold the data. Similarly, if the data chunk is larger than a single allocation unit, two (or more, as appropriate) allocation units will be allocated to hold the data chunk.
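By way of illustration only, the rounding-up behavior described above can be sketched as follows. The function names `units_needed` and `slack_bytes`, and the 4096-byte unit size used in the example, are assumptions chosen for this sketch and are not drawn from any particular device or file system.

```python
def units_needed(chunk_bytes: int, unit_bytes: int) -> int:
    """Return the number of whole allocation units required to hold a chunk.

    Hypothetical helper: both arguments are sizes in bytes.
    """
    if chunk_bytes <= 0 or unit_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division: a partially filled unit still counts as one unit.
    return -(-chunk_bytes // unit_bytes)


def slack_bytes(chunk_bytes: int, unit_bytes: int) -> int:
    """Return the bytes allocated to the chunk but not occupied by its data."""
    return units_needed(chunk_bytes, unit_bytes) * unit_bytes - chunk_bytes


if __name__ == "__main__":
    unit = 4096  # assumed allocation unit size, for illustration only
    for size in (100, 4096, 4097):
        print(size, units_needed(size, unit), slack_bytes(size, unit))
```

Under these assumptions, a 100-byte chunk occupies one full unit, while a chunk only one byte larger than the unit occupies two.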
In most cases, it is desirable (or required) that all of the data for a particular data chunk be stored in close physical proximity on the storage device (for example, drive performance in reading the data generally will be better if the drive does not have to read the data from several physically separate locations). Indeed, in many cases, the storage of a particular data set will require sufficient contiguous space to store the entire data set. Those skilled in the art will appreciate, however, that the nature of persistent storage devices, and how they are typically used, often leads to a situation known as “fragmentation,” in which available allocation units become scattered (or fragmented) across the storage medium as a result of iteratively writing and deleting data sets of varying sizes. In some cases, fragmentation can lead to a situation in which, although there is plentiful aggregate free space on the storage device, there is insufficient contiguous free space to store a particular data set. Hence, fragmentation can result in the underutilization of the storage device.
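The distinction between aggregate free space and contiguous free space can be illustrated with a minimal sketch. Here the storage medium is modeled, purely as an assumption for this example, as a list of booleans in which `True` marks a free allocation unit; the names `total_free`, `largest_free_run`, and `can_store_contiguously` are hypothetical.

```python
from typing import List


def total_free(free_map: List[bool]) -> int:
    """Count all free allocation units, regardless of where they are located."""
    return sum(1 for is_free in free_map if is_free)


def largest_free_run(free_map: List[bool]) -> int:
    """Return the length of the longest run of adjacent free allocation units."""
    best = run = 0
    for is_free in free_map:
        run = run + 1 if is_free else 0  # an allocated unit ends the current run
        best = max(best, run)
    return best


def can_store_contiguously(free_map: List[bool], units: int) -> bool:
    """Report whether a data set needing `units` contiguous units would fit."""
    return largest_free_run(free_map) >= units


if __name__ == "__main__":
    # Assumed layout after repeated writes and deletes (True = free unit).
    free_map = [True, False, True, True, False, True, False, True]
    print(total_free(free_map))                 # free units overall
    print(largest_free_run(free_map))           # longest contiguous free run
    print(can_store_contiguously(free_map, 3))  # whether a 3-unit data set fits
```

In this assumed layout, five units are free in aggregate, yet no run is longer than two units, so a data set requiring three contiguous units cannot be stored.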
Most systems deal with fragmentation either by allocating only fixed-size pages (when it is not essential that the allocated space be contiguous) or by periodically running a wholesale compaction (often referred to as de-fragmentation). Neither of these solutions is ideal, however. As noted above, in many applications, allocated space must be contiguous, such that fixed-size pages are infeasible. Further, a wholesale de-fragmentation operation generally is quite costly from a resource utilization perspective and often requires the device to be effectively out of service until the operation completes. In many environments (such as high-performance and high-availability environments), the costs of wholesale de-fragmentation may be prohibitive.
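To suggest why wholesale compaction tends to be costly, the following sketch models one simple form of it. The representation (a list in which each entry names the data set that owns an allocation unit, or `None` if the unit is free) and the function name `compact` are assumptions made for illustration; actual de-fragmentation tools may operate quite differently.

```python
from typing import List, Optional, Tuple


def compact(layout: List[Optional[str]]) -> Tuple[List[Optional[str]], int]:
    """Slide every allocated unit toward the start of the medium.

    Returns the compacted layout and the number of units that had to be
    relocated, which serves here as a rough proxy for the work performed.
    """
    result: List[Optional[str]] = [None] * len(layout)
    write = 0  # next position to fill in the compacted layout
    moved = 0
    for read, owner in enumerate(layout):
        if owner is None:
            continue  # free units are skipped and collect at the end
        if read != write:
            moved += 1  # this unit must be copied to a new location
        result[write] = owner
        write += 1
    return result, moved


if __name__ == "__main__":
    # Assumed fragmented layout: data sets "A", "B", and "C" with free gaps.
    layout = ["A", None, "B", "B", None, "C", None, None]
    compacted, moved = compact(layout)
    print(compacted)
    print(moved)
```

In this model, every allocated unit located after the first free gap is relocated, so the work grows with the amount of stored data rather than with the size of the request that prompted the compaction.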