A storage is a computer-readable medium capable of storing data in blocks. Storages face a myriad of threats to the data they store and to their smooth and continuous operation. To mitigate these threats, a backup of the data in a source storage may be created to represent the state of the source storage at a particular point in time and to enable the restoration of the data at some future time. Such a restoration may become desirable, for example, if the storage experiences corruption of its stored data, if the storage becomes unavailable, or if a user wishes to create a second identical storage.
A storage is typically logically divided into a finite number of fixed-length blocks. A storage also typically includes a file system which tracks the locations of the blocks that are allocated to each file that is stored in the storage. The file system also tracks the blocks that are not allocated to any file. The file system generally tracks allocated and unallocated blocks using specialized data structures, referred to as file system metadata. File system metadata is also stored in designated blocks in the storage.
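As an illustration of the bookkeeping described above, the following is a minimal sketch of a toy file system that tracks which blocks belong to each file and which blocks are unallocated. The class name `ToyFileSystem`, its methods, and the first-free allocation policy are hypothetical and chosen only for illustration; the sketch also keeps its metadata in memory rather than in designated blocks of the storage.

```python
class ToyFileSystem:
    """Hypothetical file system metadata: a per-file block list plus allocation flags."""

    def __init__(self, block_count):
        self.file_blocks = {}                    # file name -> list of block indexes
        self.allocated = [False] * block_count   # one flag per fixed-length block

    def unallocated_blocks(self):
        """Return the indexes of blocks not allocated to any file."""
        return [i for i, used in enumerate(self.allocated) if not used]

    def create_file(self, name, num_blocks):
        """Allocate the first free blocks to a new file (assumed policy)."""
        free = self.unallocated_blocks()
        if len(free) < num_blocks:
            raise OSError("not enough free blocks")
        chosen = free[:num_blocks]
        for i in chosen:
            self.allocated[i] = True
        self.file_blocks[name] = chosen
        return chosen

    def delete_file(self, name):
        """Release the blocks of a file so they become unallocated again."""
        for i in self.file_blocks.pop(name):
            self.allocated[i] = False
```

After a file is deleted and another is created, the blocks of a single file may no longer be contiguous, which is one reason the block list of each file must be tracked explicitly.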
Various techniques exist for backing up a source storage. One common technique involves backing up individual files stored in the source storage on a per-file basis. This technique is often referred to as file backup. File backup uses the file system of the source storage as a starting point and performs a backup by copying the files to a destination storage. Using this approach, individual files are backed up if they have been modified since the previous backup. File backup may be useful for finding and restoring a few lost or corrupted files. However, file backup may also incur significant bandwidth and logical overhead because it requires tracking and storing information about where each file exists within the file system of the source storage and of the destination storage.
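The per-file selection described above can be sketched as follows. This is a simplified in-memory model, not an actual backup tool: the function name `file_backup`, the representation of each file as a `(modified_time, content)` pair, and the use of a modification timestamp as the change indicator are all assumptions made for illustration.

```python
def file_backup(source_files, destination, last_backup_time):
    """Copy only files modified after the previous backup.

    source_files: dict mapping file name -> (modified_time, content)
    destination:  dict mapping file name -> content (updated in place)
    Returns the names of the files that were copied.
    """
    copied = []
    for name, (modified_time, content) in source_files.items():
        if modified_time > last_backup_time:
            destination[name] = content   # per-file copy to the destination storage
            copied.append(name)
    return copied
```

Note that every decision in this sketch is made per file name, which reflects the dependence of file backup on the file system of the source storage.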
Another common technique for backing up a source storage ignores the locations of individual files stored in the source storage and instead simply backs up all allocated blocks stored in the source storage. This technique is often referred to as image backup because the backup generally contains or represents an image, or copy, of the entire allocated contents of the source storage. Using this approach, individual allocated blocks are backed up if they have been modified since the previous backup. Because image backup backs up all allocated blocks of the source storage, image backup backs up both the blocks that make up the files stored in the source storage and the blocks that make up the file system metadata. Also, because image backup backs up all allocated blocks rather than individual files, this approach does not necessarily need to be aware of the file system metadata or the files stored in the source storage, beyond the minimal knowledge of the file system metadata needed to back up only allocated blocks, since unallocated blocks are generally not backed up.
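A minimal sketch of a full image backup under these assumptions follows. The storage is modeled as a list of byte strings and the allocation information as a list of flags; the function names `image_backup` and `restore_image` are hypothetical. The only file system knowledge used is which blocks are allocated.

```python
def image_backup(blocks, allocated):
    """Copy every allocated block, keyed by block index; file boundaries are ignored."""
    return {i: blocks[i] for i, used in enumerate(allocated) if used}


def restore_image(image, block_count, block_size):
    """Rebuild a storage from an image; blocks absent from the image are zero-filled."""
    blank = bytes(block_size)
    return [image.get(i, blank) for i in range(block_count)]
```

Because the blocks holding file system metadata are themselves allocated, they are captured by the same loop as the blocks holding file data.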
An image backup can be relatively fast compared to a file backup because reliance on the file system is minimized and because seeking is reduced. In particular, during an image backup, blocks are generally read sequentially with relatively limited seeking. In contrast, during a file backup, blocks that make up individual files may be scattered, resulting in relatively extensive seeking.
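The difference in seeking can be illustrated with a rough count of non-contiguous jumps in the order in which blocks are read. The function `count_seeks` and the example block layout are hypothetical, and a jump to any block other than the next one is assumed to cost one seek.

```python
def count_seeks(read_order):
    """Count reads that do not immediately follow the previously read block."""
    return sum(1 for prev, cur in zip(read_order, read_order[1:]) if cur != prev + 1)


# Hypothetical layout: two files whose blocks are scattered across the storage.
files = {"a": [0, 5, 2], "b": [1, 4]}

# File backup reads file by file, following each file's block list.
file_order = [block for block_list in files.values() for block in block_list]

# Image backup reads the same allocated blocks in ascending order.
image_order = sorted(file_order)

file_seeks = count_seeks(file_order)     # reads 0, 5, 2, 1, 4
image_seeks = count_seeks(image_order)   # reads 0, 1, 2, 4, 5
```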
As noted above, each successive image backup of a source storage may include only those blocks of the source storage that were modified subsequent to the point in time of the prior image backup. In order to easily back up only modified blocks during the creation of an image backup, it may be useful to track which blocks are modified between a point in time of a prior image backup and a point in time of a subsequent image backup, instead of determining which blocks are modified by performing a full compare of every block in the source storage with corresponding blocks in image backups that were previously created.
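The two approaches contrasted above can be sketched as follows. The class `TrackedStorage` and the function `full_compare` are hypothetical names; the sketch records the index of every written block in a set and, for brevity, ignores block allocation.

```python
class TrackedStorage:
    """Storage model that records which blocks are written between backups."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.modified = set()   # indexes of blocks written since the prior backup

    def write(self, index, data):
        self.blocks[index] = data
        self.modified.add(index)   # tracking costs one set insertion per write

    def incremental_backup(self):
        """Copy only the tracked blocks, then start a new tracking period."""
        image = {i: self.blocks[i] for i in sorted(self.modified)}
        self.modified.clear()
        return image


def full_compare(current_blocks, previous_blocks):
    """Alternative without tracking: compare every block against the prior state."""
    return [i for i, (new, old) in enumerate(zip(current_blocks, previous_blocks))
            if new != old]
```

With tracking, the work at backup time is proportional to the number of modified blocks, whereas the full compare must read every block of the source storage and the corresponding block of the prior backups.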
Modifications to a source storage may be tracked while the source storage is accessed by an operating environment, such as an operating system. A record of these modifications may then be saved to persistent storage when the operating system is shut down, and later loaded from the persistent storage when the operating system is rebooted, thereby providing persistent modification tracking across reboots of the operating system.
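A minimal sketch of saving and reloading such a record is shown below. The function names, the JSON file format, and the convention of returning `None` when no record exists are assumptions for illustration only; no particular on-disk format is implied by the description above.

```python
import json
import os


def save_tracking_record(modified, path):
    """At shutdown: persist the indexes of the modified blocks."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(modified), f)


def load_tracking_record(path):
    """At boot: reload the record.

    Returns None if no record exists, meaning the set of modified blocks is
    unknown and cannot be assumed to be empty.
    """
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        return set(json.load(f))
```

Tracking would then resume from the loaded set, so that modifications made before the shutdown and after the reboot end up in the same record.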
One common problem with persistent modification tracking across reboots of an operating system is a lack of reliability due to multiple operating environments accessing the source storage. For example, if an alternate operating environment, such as a pre-boot virus scanner that is separate from the operating system mentioned above, is granted access to the source storage between the shutdown and reboot of the operating system, modifications made to the source storage may not be tracked because the pre-boot virus scanner does not have the same tracking capabilities as the operating system. Consequently, modifications made during the operation of the pre-boot virus scanner may not be reflected in the persistent modification tracking record. Hence, the persistent modification tracking record that is loaded from the persistent storage upon reboot of the operating system may be incomplete because it will be missing the modifications made by the pre-boot virus scanner, and any subsequent image backup that is based on this record will have a data integrity problem because it will also be missing these modifications.
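The failure described above can be reproduced in a small simulation. Everything in this sketch is hypothetical, including the function names and the four-block storage; it only demonstrates the problem and does not show any remedy.

```python
def run_scenario():
    """Simulate an untracked write between shutdown and reboot."""
    blocks = [b"a", b"b", b"c", b"d"]
    modified = set()

    def tracked_write(index, data):
        # Operating system with modification tracking.
        blocks[index] = data
        modified.add(index)

    def untracked_write(index, data):
        # Alternate environment, e.g. a pre-boot virus scanner: no tracking.
        blocks[index] = data

    base_image = list(blocks)            # prior image backup
    tracked_write(1, b"B")               # modification while the OS is running
    saved_record = set(modified)         # record persisted at shutdown
    untracked_write(3, b"D")             # modification between shutdown and reboot
    loaded_record = set(saved_record)    # record loaded upon reboot

    # Subsequent image backup based only on the loaded record.
    incremental = {i: blocks[i] for i in loaded_record}

    # Restoring from the prior image plus the incremental backup.
    restored = list(base_image)
    for i, data in incremental.items():
        restored[i] = data
    return blocks, restored
```

In this scenario the restored storage differs from the actual source storage at block 3, because the write made by the alternate environment never reached the tracking record.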
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one example technology area where some embodiments described herein may be practiced.