A storage is computer-readable media capable of storing data in blocks. Storages face a myriad of threats to the data they store and to their smooth and continuous operation. In order to mitigate these threats, a backup of the data in a storage may be created to represent the state of the source storage at a particular point in time and to enable the restoration of the data at some future time. Such a restoration may become desirable, for example, if the storage experiences corruption of its stored data, if the storage becomes unavailable, or if a user wishes to create a second identical storage.
A storage is typically logically divided into a finite number of fixed-length blocks. A storage also typically includes a file system which tracks the locations of the blocks that are allocated to each file that is stored in the storage as well as the locations of allocated blocks which are used by the file system for its own internal on-storage structures. The file system may also track free blocks that are neither allocated to any file nor allocated to any file system on-storage structure. The file system generally tracks allocated and/or free blocks using a specialized on-storage structure stored in the file system metadata (FSM), referred to herein as a file system block allocation map (FSBAM).
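By way of illustration only, an FSBAM may be pictured as a bitmap holding one bit per fixed-length block. The following minimal sketch assumes that layout; the class name `BlockAllocationMap` and its methods are hypothetical and do not correspond to the on-storage structure of any particular file system.

```python
class BlockAllocationMap:
    """Hypothetical bitmap-style allocation map: one bit per block (1 = allocated)."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks
        # Round up so that every block has a bit, eight blocks per byte.
        self.bits = bytearray((total_blocks + 7) // 8)

    def _check(self, block: int) -> None:
        if not 0 <= block < self.total_blocks:
            raise IndexError(f"block {block} is outside the storage")

    def allocate(self, block: int) -> None:
        """Mark a block as allocated to a file or to a file system structure."""
        self._check(block)
        self.bits[block // 8] |= 1 << (block % 8)

    def free(self, block: int) -> None:
        """Mark a block as free."""
        self._check(block)
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def is_allocated(self, block: int) -> bool:
        self._check(block)
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocated_blocks(self) -> list:
        """Return the indices of all allocated blocks in ascending order."""
        return [b for b in range(self.total_blocks) if self.is_allocated(b)]


fsbam = BlockAllocationMap(16)
for block in (0, 1, 9):
    fsbam.allocate(block)
fsbam.free(1)
```

In this sketch, blocks 0 and 9 remain allocated after block 1 is freed, and every other block is tracked as free.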
Various techniques exist for backing up a source storage. One common technique involves backing up individual files stored in the source storage on a per-file basis. Another common technique for backing up a source storage ignores the locations of individual files stored in the source storage and instead simply backs up all allocated blocks stored in the source storage. This technique is often referred to as image backup because the backup generally contains or represents an image, or copy, of the entire allocated content of the source storage. Using this approach, individual allocated blocks are backed up if they have been changed since the previous backup. Because image backup backs up all allocated blocks of the source storage, image backup backs up both the blocks that make up the files stored in the source storage and the blocks that make up the file system on-storage structures such as the FSM. Also, because image backup backs up all allocated blocks rather than individual files, this approach does not generally need to be aware of the file system on-storage data structures or the files stored in the source storage, beyond utilizing the FSBAM in order to back up only allocated blocks, since free blocks are not generally backed up.
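The block selection performed by an image backup, as described above, may be sketched as follows. The sketch assumes the source storage is modeled as a list of fixed-length blocks and that the set of allocated block indices has already been read from the FSBAM; the function name `image_backup` and its parameters are hypothetical.

```python
def image_backup(source, allocated, changed=None):
    """Copy allocated blocks from the source storage into a backup.

    source    -- list of blocks (bytes), indexed by block number
    allocated -- set of allocated block indices, as reported by the FSBAM
    changed   -- optional set of block indices changed since the previous
                 backup; when None, every allocated block is copied
    """
    backup = {}
    # Visit blocks in ascending order; file boundaries are never consulted.
    for index in sorted(allocated):
        if changed is None or index in changed:
            backup[index] = source[index]
    return backup


source = [b"fsm0", b"docA", b"docB", b"free", b"docC", b"free"]
allocated = {0, 1, 2, 4}

full = image_backup(source, allocated)
incremental = image_backup(source, allocated, changed={2, 3})
```

Here `full` holds blocks 0, 1, 2, and 4, while `incremental` holds only block 2, because block 3 is free and is therefore skipped even though it is listed as changed.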
An image backup can be relatively fast compared to a file backup, both because reliance on the file system is minimized and because seeking is reduced. In particular, during an image backup, blocks may be read sequentially with relatively limited seeking. In contrast, during a file backup, blocks that make up the content of individual files may be scattered, resulting in relatively extensive seeking.
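The difference in seeking may be illustrated with a deliberately simplified model in which the cost of a read order is the total distance the read position must jump beyond its natural advance to the next block. The function `seek_distance` and the block layout below are hypothetical and ignore caching, rotational latency, and the behavior of non-rotational media.

```python
def seek_distance(read_order):
    """Sum the jumps between consecutive reads in a simplified model.

    After reading block b the read position rests at b + 1, so reading
    block b + 1 next costs nothing and any other block costs the gap.
    """
    total = 0
    for current, following in zip(read_order, read_order[1:]):
        total += abs(following - (current + 1))
    return total


# Hypothetical layout: the blocks of two files are interleaved on the storage.
files = {"a": [7, 2, 9], "b": [1, 8, 3]}

# A file backup reads each file's blocks in file order.
file_order = [block for blocks in files.values() for block in blocks]

# An image backup reads the same allocated blocks in ascending order.
image_order = sorted(file_order)

file_cost = seek_distance(file_order)
image_cost = seek_distance(image_order)
```

Under this model the file-order read costs 33 block positions of seeking, whereas the ascending read of the same blocks costs 3.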
One common problem that is encountered when repeatedly backing up a source storage using an image backup is the potential for the inclusion of unused blocks in successive backups. For example, a very large digital movie file may initially be stored on a source storage. The allocated blocks that correspond to the movie file may then be stored in an initial backup of the source storage. After the creation of the initial backup, the movie file may then be deleted from the source storage, thus rendering the corresponding blocks unused. As subsequent versions of the backup of the source storage are created, the unused blocks corresponding to the deleted movie file may be needlessly retained in one or more of the subsequent versions of the backup. Retaining unused blocks in the subsequent versions of the backup may increase the overall size requirements of a storage where the subsequent versions of the backup are stored and/or increase the processing time associated with restoring the subsequent versions of the backup.
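The movie file example may be made concrete with the following sketch, which assumes that each version of the backup is modeled as a mapping from block index to block content and that a restoration merges the versions from oldest to newest. The function names `collapse` and `retained_unused` are hypothetical and serve only to expose the problem.

```python
def collapse(chain):
    """Merge backup versions oldest to newest; newer copies of a block win."""
    merged = {}
    for version in chain:
        merged.update(version)
    return merged


def retained_unused(chain, allocated_now):
    """List the blocks still carried by the backup that are no longer allocated."""
    return sorted(set(collapse(chain)) - set(allocated_now))


# Initial backup: block 0 holds file system metadata, block 1 a document,
# and blocks 2 through 4 the movie file.
initial = {0: b"fsm-v1", 1: b"doc", 2: b"mov", 3: b"mov", 4: b"mov"}

# The movie file is deleted. Only the metadata block changes, so the next
# version of the backup contains block 0 alone.
second = {0: b"fsm-v2"}
allocated_now = {0, 1}

stale = retained_unused([initial, second], allocated_now)
```

In this sketch `stale` contains blocks 2, 3, and 4: the blocks of the deleted movie file remain in the merged backup even though the source storage no longer uses them.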
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one example technology area where some embodiments described herein may be practiced.