Snapshots, also referred to as shadow copies, are commonly used to recreate a state of data storage volumes. A snapshot is a record of the state of a volume at a particular time, i.e., a snapshot time. Snapshots are also commonly used to create backup copies of data storage volumes. Different techniques are used for different types of snapshots. For example, some snapshots represent a complete copy of the volume. A backup client creates a backup copy of the volume by transferring the complete copy of the volume as a snapshot to a secure data storage location. Some snapshots are differential and include only the data blocks of a volume that have changed since the snapshot time. For snapshots of this kind, the backup client creates the backup copy by copying the snapshot, as well as any data blocks from the volume that are not represented in the snapshot (i.e., those data blocks that have not been modified since the snapshot time).
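As a minimal sketch of how a backup copy might be assembled from a differential snapshot, the following assumes a copy-on-write layout in which the snapshot keeps the snapshot-time contents of each block overwritten afterwards. Blocks absent from the snapshot are read from the in-use volume. The function name and the in-memory data structures are hypothetical and chosen only for illustration.

```python
def snapshot_image(volume, saved_originals):
    """Rebuild the volume as it was at snapshot time.

    volume: list of current block contents (the in-use volume).
    saved_originals: dict mapping block index -> contents at snapshot time,
        holding only blocks overwritten after the snapshot was taken
        (assumed copy-on-write layout, not stated in the text above).
    """
    # Take a block from the snapshot when it is represented there,
    # otherwise fall back to the in-use volume.
    return [saved_originals.get(i, block) for i, block in enumerate(volume)]


# At snapshot time the volume held A, B, C, D; blocks 1 and 3 were overwritten later.
current_volume = [b"A", b"B2", b"C", b"D2"]
differential_snapshot = {1: b"B", 3: b"D"}

backup_copy = snapshot_image(current_volume, differential_snapshot)
```

Here `backup_copy` holds the snapshot-time contents of all four blocks, although only two of them were stored in the snapshot itself.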
Creating a full backup copy of a volume using a snapshot or other mechanisms often requires considerable system resources and can take an undesirably long time. For example, a complete copy of a snapshot must be read and copied to a secure backup storage location. When a differential snapshot is used, the backup copy must be recreated from the snapshot and from the in-use volume. A backup can be streamlined by using differential or incremental backup techniques. According to differential techniques, a full backup copy of a volume can be created from a snapshot of the volume. After the full backup copy is created, one or more differential backup copies are created with reference to the full backup copy.
Each differential backup copy includes only data blocks that have been modified since the last full backup copy was created. The backed-up volume can be restored by applying the full backup copy and the most recent differential backup copy. Incremental techniques also utilize a full backup copy. A first incremental backup copy is generated including data blocks that have been modified since the full backup copy was created. Subsequent incremental backup copies include data blocks that have been modified since the most recent incremental backup copy was created. In this way, each incremental backup copy is referenced either to the full backup copy (e.g., the first incremental backup copy) or to a prior incremental backup copy.
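The difference between the two restore paths can be sketched as follows. This is an illustrative model only: backup copies are represented as Python lists and dictionaries keyed by block index, and the `restore` helper is a hypothetical name.

```python
def restore(full, deltas):
    """Apply each delta (dict of block index -> contents) to a full copy, in order."""
    image = list(full)
    for delta in deltas:
        for index, block in delta.items():
            image[index] = block
    return image


full = [b"A0", b"B0", b"C0"]

# Block 0 changes before the first follow-up backup, block 2 before the second.
# Each differential copy holds every block changed since the full copy.
differentials = [{0: b"A1"}, {0: b"A1", 2: b"C2"}]
# Each incremental copy holds only the blocks changed since the previous backup copy.
incrementals = [{0: b"A1"}, {2: b"C2"}]

from_differential = restore(full, [differentials[-1]])  # full copy + most recent differential
from_incremental = restore(full, incrementals)          # full copy + every incremental, in order
```

Both paths yield the same volume image. The differential path reads fewer backup copies at restore time, while the incremental path stores less data per backup.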
Both differential and incremental backup techniques require the backup client to identify data blocks that have changed since the creation of a prior backup copy. This can be accomplished by comparing a hash or checksum of each data block on the volume to a hash or checksum of the equivalent location in the prior backup copy. This process is often computationally expensive, as the entire volume must be read to determine which data blocks have changed. Alternatively, the changed data blocks can be identified by a change block tracker (CBT) utility, which detects all the data blocks (i.e., disk sectors) that have changed during the backup.
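The hash comparison described above can be sketched as follows, assuming SHA-256 as the hash function and fixed-size blocks held in memory; both are illustrative choices, and the function names are hypothetical. Note that every block of the volume is read and hashed, which is the source of the cost.

```python
import hashlib


def block_hashes(blocks):
    """Hash every block of a backup copy (SHA-256 is an assumed choice)."""
    return [hashlib.sha256(block).digest() for block in blocks]


def changed_blocks(volume_blocks, prior_hashes):
    """Return indices of blocks whose hash differs from the prior backup copy."""
    changed = []
    for index, block in enumerate(volume_blocks):
        # A block beyond the end of the prior copy counts as changed.
        is_new = index >= len(prior_hashes)
        if is_new or hashlib.sha256(block).digest() != prior_hashes[index]:
            changed.append(index)
    return changed


prior = block_hashes([b"alpha", b"beta", b"gamma"])
changed = changed_blocks([b"alpha", b"BETA", b"gamma", b"delta"], prior)
```

In this example `changed` contains the index of the rewritten block and of the block appended after the prior backup copy.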
The backup client needs to know the data blocks that are modified between the snapshot time of the snapshot used to create the reference backup copy and the snapshot time of a snapshot that will be used to create a current backup copy. In various embodiments, this can be determined by monitoring lock requests and input/output (I/O) requests directed to the volume. Lock requests are instructions to a volume or an associated driver to place the volume in a read-only state. For example, a snapshot utility may direct a lock request to a volume or its driver before taking a snapshot in order to ensure that the volume remains consistent while the snapshot is created. The backup client may identify the data blocks of the volume that have been modified since the reference backup copy was created by examining the I/O requests (e.g., write requests) directed to the volume since the lock request corresponding to the snapshot used to create the reference backup copy.
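A minimal sketch of this monitoring is given below, assuming a tracker that is notified of every write request and every lock request directed to the volume. The class and method names are hypothetical, and the sketch assumes that every lock request belongs to the backup client's own snapshots.

```python
class ChangeTracker:
    """Records which blocks are written between consecutive lock requests."""

    def __init__(self):
        self.changed = set()

    def on_write(self, block_index):
        # Called for each write request directed to the volume.
        self.changed.add(block_index)

    def on_lock(self):
        # Called for each lock request. Returns the blocks written since the
        # previous lock request and starts a new tracking interval.
        written, self.changed = self.changed, set()
        return written


tracker = ChangeTracker()
tracker.on_lock()                 # lock for the snapshot behind the reference backup copy
for block in (4, 9, 4):
    tracker.on_write(block)       # writes arriving after the reference snapshot
to_back_up = tracker.on_lock()    # lock for the snapshot behind the current backup copy
```

Only the blocks in `to_back_up` need to be read for the current backup copy, so the volume does not have to be scanned in full.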
However, in many computer systems, the backup client is not the only application that can request a snapshot. Other applications may also utilize snapshots including, for example, management applications for managing snapshots, fast recovery applications, etc. Further, in many computer systems, a lock request does not indicate the application which requested the associated snapshot of the same volume. This means that the backup client may not be able to determine whether any given lock request is associated with its own snapshot request or a snapshot request from another application. Accordingly, it is desired to have the backup client configured to synchronize change tracking with requested snapshots.
A volume subject to the snapshot receives locks for a period of time when the snapshot is being created. However, the locks must be detected accurately before they can be used in the CBT algorithm, which detects the data blocks (i.e., disk blocks or sectors) that have been changed during the backup. As discussed above, the locks can be applied to the volume by different snapshot operations. In the conventional implementation, all of the locks have to be considered and, as a result, some redundant data is saved into a snapshot because the process cannot determine the exact locks relevant for a given snapshot. In other words, the resulting snapshot contains data that did not change during the snapshot, but that the CBT deems to have changed. In another scenario, multiple locks can cause data loss, because some locks are not properly considered. Therefore, it is critical to detect the locks corresponding to the current snapshot and optimize the functionality of the CBT.
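Both failure modes can be illustrated with a small event log. This is a hypothetical model, not a driver interface: the owner labels on the lock events are shown only for the reader, since an actual lock request does not identify its requester, and `writes_between` is an illustrative helper.

```python
def writes_between(events, start_lock, end_lock):
    """Blocks written between the start_lock-th and end_lock-th lock events (0-based)."""
    seen = -1  # index of the most recent lock event
    blocks = set()
    for kind, value in events:
        if kind == "lock":
            seen += 1
        elif start_lock <= seen < end_lock:
            blocks.add(value)
    return blocks


events = [
    ("lock", "other app"),      # lock 0: unrelated snapshot
    ("write", 0),
    ("lock", "backup client"),  # lock 1: snapshot for the reference backup copy
    ("write", 1),
    ("lock", "other app"),      # lock 2: unrelated snapshot
    ("write", 2),
    ("lock", "backup client"),  # lock 3: snapshot for the current backup copy
]

correct = writes_between(events, 1, 3)     # interval bounded by the client's own locks
too_wide = writes_between(events, 0, 3)    # starts at a foreign lock: redundant block 0
too_narrow = writes_between(events, 2, 3)  # starts at a foreign lock: block 1 is lost
```

Choosing the wrong starting lock either adds a block that is already in the reference backup copy or omits a block that changed after it.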
Accordingly, a method for optimization of lock detection in a change block tracker (CBT) is desired.