A data volume may be backed up by copying the data blocks (e.g., 512-byte blocks) in the data volume. This process, commonly referred to as block level backup, is in contrast to file level backup, which is performed by copying data files, rather than data blocks, as units. Block level backup may be performed at regular time intervals. For example, a data volume may be backed up every day at midnight.
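As an illustration only, block level backup may be sketched as a loop that reads fixed-size blocks from a volume and writes them to a backup target. The sketch below assumes an in-memory byte stream standing in for a data volume; the names `block_level_backup`, `volume`, and `backup` are hypothetical and do not come from any particular backup system.

```python
import io

BLOCK_SIZE = 512  # bytes per block, matching the example block size in the text


def block_level_backup(volume, backup, block_size=BLOCK_SIZE):
    """Copy every block from `volume` to `backup`; return the number of blocks copied."""
    volume.seek(0)
    backup.seek(0)
    copied = 0
    while True:
        block = volume.read(block_size)
        if not block:  # end of volume reached
            break
        backup.write(block)
        copied += 1
    return copied


# Hypothetical 10-block volume held in memory (5120 bytes = 10 blocks of 512 bytes).
volume = io.BytesIO(bytes(range(256)) * 20)
backup = io.BytesIO()
blocks_copied = block_level_backup(volume, backup)
```

The loop is unaware of file boundaries; it treats the volume as a sequence of blocks, which is what distinguishes block level backup from file level backup.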
Once a data volume is fully backed up, during future backup attempts, the backup process may be applied only to those blocks of data that have been changed or updated since the last backup process, thus skipping over the data blocks that have remained unchanged. To keep track of the updated data blocks, a data structure such as a bitmap or a data array may be utilized, where one bit per data block flags that data block's update status.
In a simple example, if a data volume includes 10 blocks, then a 10-bit array may be used, where bits F1 through F10 are associated with blocks B1 through B10, respectively. After a full backup, all bits are set to a first value (e.g., 0) indicating that all blocks have been backed up. If a block B3, for example, is updated between time T1 (when the previous backup was performed) and time T2 (when the next backup is performed), then a bit F3 associated with the updated block B3 is set to a second value (e.g., 1) to indicate that B3 needs to be backed up at time T2.
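The 10-block example above may be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the class `TrackedVolume`, its methods, and the dictionary used as a backup target are hypothetical, and blocks are indexed from 0, so block B3 corresponds to index 2.

```python
class TrackedVolume:
    """Hypothetical volume that flags updated blocks with one bit per block."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        # Second value (True/1): block needs backup. Nothing is backed up yet.
        self.dirty = [True] * num_blocks

    def write(self, index, data):
        # Pad or truncate to the block size, then flag the block as updated.
        self.blocks[index] = data.ljust(self.block_size, b"\0")[: self.block_size]
        self.dirty[index] = True

    def backup(self, target):
        """Copy only flagged blocks into `target`; return the indices copied."""
        copied = []
        for index, needs_backup in enumerate(self.dirty):
            if needs_backup:
                target[index] = self.blocks[index]
                self.dirty[index] = False  # first value (False/0): backed up
                copied.append(index)
        return copied


volume = TrackedVolume(10)
backup_copy = {}

full = volume.backup(backup_copy)         # time T1: full backup of all blocks
volume.write(2, b"new data")              # block B3 updated between T1 and T2
incremental = volume.backup(backup_copy)  # time T2: only B3 is copied
```

At time T2 only the flagged block is copied, and all flags return to the first value once the backup completes.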
Accordingly, time-specific snapshots of a data volume may be created at times T1, T2 and so on. A snapshot of a data volume at time T2 represents the state of the data stored on the volume at the time T2. For that reason, the data blocks in a target data volume are locked at time T2 so that the data blocks cannot be updated while the snapshot is taken. As such, applications that are attempting to write to the data blocks in a target data volume may experience a delay in performing the write operations until the snapshot process is completed.
In some implementations, instead of delaying the write process, the update data (i.e., new data) that is to be written to a locked data block is stored in a queue implemented in volatile memory or in non-volatile memory (e.g., a disk drive). Data stored in the queue is written to the target data block after the lock is released. The above implementation is associated with substantial overhead because, as mentioned, it requires the update data to be held in a queue pending a write back to the target data block.
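The queue-based approach described above may be sketched as follows, assuming a queue held in volatile memory. The class `SnapshotVolume` and its methods are hypothetical names introduced for illustration; an actual implementation would also need to handle concurrency and persistence.

```python
from collections import deque


class SnapshotVolume:
    """Hypothetical volume that queues writes while a snapshot holds the lock."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.locked = False
        self.pending = deque()  # (block index, update data) held while locked

    def write(self, index, data):
        if self.locked:
            # Block is locked: hold the update data instead of delaying the writer.
            self.pending.append((index, data))
        else:
            self.blocks[index] = data

    def begin_snapshot(self):
        self.locked = True  # blocks may not change until the snapshot ends

    def end_snapshot(self):
        snapshot = list(self.blocks)  # state of the volume as of the lock time
        self.locked = False
        # Write the queued update data back to the target blocks, in order.
        while self.pending:
            index, data = self.pending.popleft()
            self.blocks[index] = data
        return snapshot


volume = SnapshotVolume(10)
volume.write(2, b"old")
volume.begin_snapshot()          # time T2: lock the volume
volume.write(2, b"new")          # deferred, not applied yet
queued = len(volume.pending)
snapshot = volume.end_snapshot()
```

Every write issued while the lock is held adds an entry to `pending`, so the queue grows with the volume of data written during the snapshot, and each entry must be written a second time once the lock is released.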
If the queue is implemented in main memory, the system may run out of memory if a large volume of data is written to the locked data blocks during the snapshot process. If the queue is implemented on the disk drive, a heavy burden may be placed on the system due to the overhead associated with tracking the location of the update data that is stored on disk, reading the update data from disk, and storing the update data in the target data blocks after the snapshot process has ended.
Once the snapshot process ends, a snapshot SNT2 of the target volume is maintained (e.g., a snapshot of the target volume containing blocks B1 to B10 is retained as of time T2, when the snapshot was taken). It is noteworthy that if a large number of data blocks have been updated since the time of the last backup (e.g., at time T1), the snapshot process (e.g., at time T2) may take a relatively long time. A faster and more efficient data backup system is desirable.