Generally, a file system provides a mechanism for storing and organizing files in such a way that the files can be efficiently and effectively retrieved. A file system may exist on top of a block device, which provides a storage medium for storing the data. In some cases, the block device may include one or more logical devices (i.e., volumes), each of which is abstracted from multiple physical data storage devices. The file system may provide a level of abstraction between applications and end users accessing the data and the block devices storing the data. Although only one file system may exist on a given block device, each file system may reside over one or more block devices.
When a file is created, the file system may reserve space for the file on the underlying block device and utilize the reserved space to store the data being written to the file. As the amount of data being written to the block device increases, the file system may reserve and utilize more space from unused regions of the block device. When a file is deleted from the block device, the file system may mark the regions that were utilized by the file as free.
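The reserve-and-free behavior described above can be sketched as a toy allocator. This is a minimal illustration only; the class name `ToyBlockAllocator`, its methods, and the choice to reserve the lowest-numbered free blocks are assumptions made for the example and do not describe any particular file system.

```python
class ToyBlockAllocator:
    """Hypothetical allocator over a block device with a fixed number of blocks."""

    def __init__(self, total_blocks):
        self.free = set(range(total_blocks))  # blocks not reserved by any file
        self.files = {}                       # file name -> blocks reserved for it

    def write(self, name, num_blocks):
        """Reserve num_blocks free blocks for the named file and return them."""
        if num_blocks > len(self.free):
            raise OSError("out of space")
        blocks = sorted(self.free)[:num_blocks]  # assumption: lowest free blocks first
        self.free.difference_update(blocks)
        self.files.setdefault(name, []).extend(blocks)
        return blocks

    def delete(self, name):
        """Mark every block that was utilized by the file as free."""
        self.free.update(self.files.pop(name))
```

In this sketch, deleting a file only updates the file system's own bookkeeping; nothing tells the underlying block device that the regions are no longer needed.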
In some implementations of the file system, the recently freed space in the underlying block device may be wasted. In a first implementation where the file system is a thin provisioned system, a greater amount of space may be “advertised” to applications and end users than is actually available. While thin provisioning may yield a greater percentage of storage utilization, one drawback is that the file system may not efficiently utilize free space in the block device after files have been deleted. For example, as previously described, when a file is deleted from the block device, the file system may mark regions that were utilized by the file as free. However, in a thin provisioned system, when a new file is created, the file system may not write the new file in the previously utilized regions that are now free. Instead, in order to avoid fragmentation, the thin provisioned system may attempt to write the new file in newer regions of the block device that have not been previously utilized. If these newer regions are not available, especially if multiple file systems access a given block device, then the file system may receive out-of-space errors.
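The out-of-space scenario can be illustrated with a small model. The sketch below is hypothetical: `ThinVolume`, `FreshRegionFileSystem`, and the block counts are invented for the example, and the policy of always writing to never-used regions is a simplification of the behavior described above.

```python
class ThinVolume:
    """Hypothetical thin provisioned volume: advertises more blocks than it can back."""

    def __init__(self, advertised_blocks, physical_blocks):
        self.advertised_blocks = advertised_blocks
        self.physical_blocks = physical_blocks
        self.mapped = set()  # advertised blocks currently backed by physical space

    def write(self, block):
        if not 0 <= block < self.advertised_blocks:
            raise ValueError("block outside advertised range")
        if block not in self.mapped:
            if len(self.mapped) >= self.physical_blocks:
                raise OSError("out of space")
            self.mapped.add(block)  # back the block on first write


class FreshRegionFileSystem:
    """Hypothetical file system that always writes new files to never-used regions."""

    def __init__(self, volume):
        self.volume = volume
        self.next_unused = 0  # first block that has never been written
        self.files = {}       # file name -> blocks
        self.freed = set()    # blocks marked free, but still mapped in the volume

    def create(self, name, num_blocks):
        blocks = list(range(self.next_unused, self.next_unused + num_blocks))
        for block in blocks:
            self.volume.write(block)
        self.next_unused += num_blocks
        self.files[name] = blocks

    def delete(self, name):
        self.freed.update(self.files.pop(name))
```

With a volume that advertises 8 blocks but backs only 4, creating a 4-block file, deleting it, and then creating a 1-block file raises an out-of-space error in this model, even though four blocks are marked free by the file system.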
In a second implementation where the file system takes a snapshot (i.e., point-in-time image) of the underlying volume, the file system may “freeze” the region taken by the snapshot, thereby preventing any new writes to the region, until the corresponding snapshot is deleted. Thus, if the snapshot includes freed space, the file system may unnecessarily freeze the freed space. For example, a volume may store a file that is subsequently deleted. The file system may then take a snapshot of the volume, which includes the region previously occupied by the file, and freeze the volume. When a new file is created, the file system may attempt to write the new file to newer regions of the volume because the previously utilized region contained in the snapshot has been frozen. Thus, the previously utilized region becomes redundant space in the volume.
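As a rough illustration of the snapshot case, the following sketch models a volume whose snapshot freezes every block that has ever been written, whether or not the file system still uses it. The class `SnapshotVolume` and its methods are hypothetical names chosen for the example.

```python
class SnapshotVolume:
    """Hypothetical volume whose snapshot freezes all previously written blocks."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.written = set()  # blocks that have ever been written
        self.in_use = set()   # blocks the file system currently considers allocated
        self.frozen = set()   # blocks held by the snapshot; no new writes allowed

    def allocate(self):
        """Return the first block that is neither in use nor frozen."""
        for block in range(self.total_blocks):
            if block not in self.in_use and block not in self.frozen:
                self.in_use.add(block)
                self.written.add(block)
                return block
        raise OSError("out of space")

    def free(self, block):
        self.in_use.discard(block)

    def take_snapshot(self):
        """Freeze every written block and return those that were already free."""
        self.frozen = set(self.written)
        return self.frozen - self.in_use  # frozen but unused: redundant space
```

If a block is allocated, freed, and then captured by a snapshot, the next allocation in this model skips it and lands in a newer region, leaving the freed block as redundant space until the snapshot is deleted.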
In a third implementation where the file system takes a snapshot of the underlying volume, new writes that are less than a block size of the block device may require a read-modify-write operation. In the read-modify-write operation, part of the data is read from the snapshot space and, along with the new data, is written to the underlying volume. When a file stored in a given block is deleted, new writes to the block may unnecessarily require the read-modify-write operation, thereby negatively affecting snapshot performance.
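A read-modify-write for a sub-block write might look like the sketch below. The 8-byte `BLOCK_SIZE` and the function name `read_modify_write` are assumptions chosen to keep the example small; real block sizes are far larger.

```python
BLOCK_SIZE = 8  # hypothetical block size, in bytes


def read_modify_write(snapshot_block: bytes, offset: int, new_data: bytes) -> bytes:
    """Merge a partial write with the old block contents and return the full block."""
    if len(snapshot_block) != BLOCK_SIZE:
        raise ValueError("snapshot block must be exactly one block long")
    if offset < 0 or offset + len(new_data) > BLOCK_SIZE:
        raise ValueError("write does not fit inside the block")
    merged = bytearray(snapshot_block)                 # read: old data from snapshot space
    merged[offset:offset + len(new_data)] = new_data   # modify: overlay the new bytes
    return bytes(merged)                               # write: full block for the volume
```

When the old block belonged to a deleted file, the bytes preserved around the new data carry no useful information, so the read step in this sketch is pure overhead.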
In these example implementations and others, the file system unnecessarily wastes space in the underlying volume. By reclaiming this wasted space, the file system may avoid out-of-space errors and eliminate redundant space in the underlying volume.
It is with respect to these and other considerations that the disclosure made herein is presented.