A file is a logical unit of data in a file system. A snapshot of a file is a read-only copy of a file as it existed at a certain time. That is, a snapshot of a file that can be read from and written to (hereinafter referred to as a production file) may be created at a given point in time that reflects the content of the production file at that particular point in time. If the production file is modified after the snapshot is created, the snapshot of the production file remains the same. The snapshot file can be used in numerous ways. For example, if the production file later becomes lost, corrupted, or modified in a way that a user is unhappy with, the snapshot can be used to restore the production file to its state at the time the snapshot was created.
FIG. 1 is a block diagram of an example of a conventional inode for a file in a typical file system. An inode is a data structure for a file that is used to store information about the file. Typically, a file system includes an inode for each file in the file system. FIG. 1 illustrates some of the information stored about a file in a conventional inode 180. Inode 180 includes a mode field 181, an access time field 182, a change time field 183, one or more data block pointers 184, and one or more indirect block pointers 185. The mode field 181 specifies the access permissions for the file (e.g., which users can read, write, and/or execute the file), the access time field 182 is used to store the last time the file was accessed, the change time field 183 is used to store the last time that the file was modified, and the data block pointer(s) 184 and indirect block pointer(s) 185 are used to specify the storage locations at which the file data is stored on a storage medium such as a physical disk or logical volume. In particular, data block pointers 184 are pointers to blocks on disk or in a logical volume that store file data, and indirect block pointers 185 are pointers to blocks on disk or in a logical volume that store pointers to data blocks or other indirect blocks.
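By way of illustration only, the fields of inode 180 may be modeled as in the following minimal sketch. The class name `Inode`, the function `data_blocks`, and the representation of indirect blocks as a dictionary keyed by block number are assumptions made for this example and do not describe any particular file system's on-disk layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inode:
    """Hypothetical in-memory model of the fields shown in FIG. 1."""
    mode: int                   # access permissions (mode field 181)
    access_time: float          # last access time (field 182)
    change_time: float          # last modification time (field 183)
    data_block_ptrs: List[int] = field(default_factory=list)      # pointers 184
    indirect_block_ptrs: List[int] = field(default_factory=list)  # pointers 185


def data_blocks(inode: Inode, indirect_blocks: Dict[int, List[int]]) -> List[int]:
    """Return the numbers of all data blocks reachable from the inode.

    Assumption: `indirect_blocks` maps each indirect block number to the
    pointers it stores; any pointer that is not a key in this mapping is
    treated as a pointer to a data block.
    """
    found = list(inode.data_block_ptrs)
    pending = list(inode.indirect_block_ptrs)
    while pending:
        block = pending.pop(0)
        for ptr in indirect_blocks[block]:
            if ptr in indirect_blocks:
                pending.append(ptr)   # pointer to another indirect block
            else:
                found.append(ptr)     # pointer to a data block
    return found


example_inode = Inode(mode=0o644, access_time=0.0, change_time=0.0,
                      data_block_ptrs=[10, 11], indirect_block_ptrs=[20])
example_indirect = {20: [12, 21], 21: [13]}
reachable = data_blocks(example_inode, example_indirect)
```

In this example, indirect block 20 stores a pointer to data block 12 and a pointer to a second indirect block 21, which in turn stores a pointer to data block 13.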
One possible way to create a snapshot of a file is to create a copy of the inode for the file, create a copy of each data block and indirect block referenced by the file, and modify the data block pointers and indirect block pointers in the copy of the inode to point to the newly created copies of the data blocks and indirect blocks. For example, as shown in FIG. 2, production inode 201 points to two data blocks (i.e., data block 203 and data block 205). To create a snapshot of the file corresponding to inode 201, a copy 207 of production inode 201 may be created and copies 209 and 211 of data blocks 203 and 205 may be created. The pointers in inode 207 may be modified to point to data blocks 209 and 211 instead of data blocks 203 and 205 (as in the inode for the production file).
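The full-copy approach of FIG. 2 may be sketched as follows. This is a simplified illustration that handles only direct data block pointers; the function name `full_copy_snapshot`, the dictionary-based inode, and the iterator of free block numbers are assumptions made for the example. The block numbers reuse the reference numerals of FIG. 2 purely for readability.

```python
def full_copy_snapshot(production_inode, blocks, free_block_numbers):
    """Create a snapshot inode by duplicating every referenced data block.

    Assumptions: `production_inode` is a dict with a "data_ptrs" list,
    `blocks` maps block numbers to bytes, and `free_block_numbers` is an
    iterator yielding unused block numbers.
    """
    snapshot_inode = dict(production_inode)   # copy of the inode itself
    new_ptrs = []
    for ptr in production_inode["data_ptrs"]:
        new_ptr = next(free_block_numbers)
        blocks[new_ptr] = bytes(blocks[ptr])  # duplicate the block content
        new_ptrs.append(new_ptr)
    snapshot_inode["data_ptrs"] = new_ptrs    # point at the copies
    return snapshot_inode


blocks = {203: b"first block", 205: b"second block"}
inode_201 = {"mode": 0o644, "data_ptrs": [203, 205]}
inode_207 = full_copy_snapshot(inode_201, blocks, iter([209, 211]))
```

After the call, four data blocks are stored even though only two distinct block contents exist.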
One disadvantage of this approach is that duplicate copies of indirect blocks and/or data blocks may be needlessly stored. These duplicate data blocks may unnecessarily consume storage capacity. For example, in FIG. 2, duplicate copies of data block 203 and data block 205 are created when the snapshot is created. If the content of the production file is never modified after the snapshot is created, then the duplicate copies of these data blocks are never needed to restore the production file to its state at the time of creation of the snapshot.
Thus, in a more sophisticated approach to creating a snapshot of a file, a copy of the inode of the production file is created when the snapshot is taken, but data blocks and/or indirect blocks are copied only when the corresponding blocks in the production file are modified. For example, as shown in FIG. 3A, production inode 301 includes pointers to data block 303 and data block 305. When a snapshot is taken, snapshot inode 307 is created by copying production inode 301. Initially, snapshot inode 307 includes pointers to data block 303 and data block 305. If the production file is modified, then new blocks may be allocated to store the modified data and the appropriate pointer or pointers in production inode 301 may be modified to point to the new data blocks. For example, as shown in FIG. 3B, if a write occurs that would result in data block 303 being modified, a new data block 309 is allocated to store the modified data block and the pointer to data block 303 in production inode 301 is modified to point to data block 309. Snapshot inode 307 still points to the original data blocks of the production file (i.e., data blocks 303 and 305).
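The sequence of FIGS. 3A and 3B may be sketched as follows. The function names `take_snapshot` and `write_block`, the dictionary-based inode, and the iterator of free block numbers are assumptions made for the example. For simplicity, the sketch allocates a new block on every write, whereas a practical implementation would likely do so only when the block is still shared with a snapshot.

```python
def take_snapshot(production_inode):
    """Copy only the inode; the data blocks themselves remain shared."""
    snapshot_inode = dict(production_inode)
    snapshot_inode["data_ptrs"] = list(production_inode["data_ptrs"])
    return snapshot_inode


def write_block(production_inode, index, data, blocks, free_block_numbers):
    """Store modified data in a newly allocated block and repoint the
    production inode, leaving the original block untouched.

    Assumption: `free_block_numbers` is an iterator of unused block numbers.
    """
    new_ptr = next(free_block_numbers)
    blocks[new_ptr] = data
    production_inode["data_ptrs"][index] = new_ptr


blocks = {303: b"original A", 305: b"original B"}
inode_301 = {"mode": 0o644, "data_ptrs": [303, 305]}

inode_307 = take_snapshot(inode_301)                          # FIG. 3A
write_block(inode_301, 0, b"modified A", blocks, iter([309]))  # FIG. 3B
```

After the write, the production inode points to blocks 309 and 305 while the snapshot inode continues to point to blocks 303 and 305, so block 305 is stored only once.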
When using such a technique, determining whether blocks that are pointed to by a snapshot inode can be reallocated (i.e., freed) when the snapshot is deleted may present challenges. That is, when a snapshot is deleted, it is desirable to reallocate the blocks pointed to by the snapshot, if no other snapshots or production files are using those blocks.
One prior art technique for determining whether blocks that are pointed to by a deleted snapshot can be freed is described in U.S. patent application Ser. No. 10/668,546, issued as U.S. Pat. No. 7,555,504 to Bixby et al., which is incorporated herein by reference in its entirety. This application describes a technique whereby each block is owned by either a production copy of a file or a snapshot copy. Ownership of a block is designated using a special owner bit (e.g., the most significant bit) in the pointer to the block in the inode of the snapshot or production file. If a snapshot or production file is designated as the owner of an indirect block, then it is considered the owner of all blocks pointed to directly or indirectly by the indirect block. When a snapshot of a production file is deleted, the blocks owned by the snapshot are analyzed. If a block is owned by the snapshot and the corresponding block in the next most recent snapshot of the same production file (or the production file if the snapshot being deleted is the most recent) is not owned by that snapshot, then ownership of the block is passed to the next most recent snapshot. If a block is owned by the snapshot and the corresponding block in the next most recent snapshot of the same production file (or the production file if the snapshot being deleted is the most recent) is owned by that snapshot (or no corresponding block can be found), then the block may be freed.
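The ownership rules summarized above may be illustrated by the following simplified sketch, which is not taken from the referenced patent. It assumes 32-bit block pointers with the most significant bit serving as the owner bit, considers only direct data block pointers, and assumes that a pointer in the successor whose owner bit is clear refers to the same block as the corresponding pointer in the snapshot being deleted. The names `OWNER_BIT`, `is_owner`, `block_number`, and `delete_snapshot` are hypothetical.

```python
OWNER_BIT = 1 << 31  # assumed: most significant bit of a 32-bit pointer


def is_owner(ptr):
    """True if the owner bit is set in the pointer."""
    return bool(ptr & OWNER_BIT)


def block_number(ptr):
    """Strip the owner bit to recover the block number."""
    return ptr & ~OWNER_BIT


def delete_snapshot(snapshot_ptrs, successor_ptrs):
    """Return the block numbers that may be freed when a snapshot is deleted.

    `successor_ptrs` are the pointers of the next most recent snapshot, or
    of the production file if the deleted snapshot is the most recent.
    Ownership is passed to the successor in place where required.
    """
    freed = []
    for i, ptr in enumerate(snapshot_ptrs):
        if not is_owner(ptr):
            continue                      # not owned by the deleted snapshot
        has_corresponding = i < len(successor_ptrs)
        if has_corresponding and not is_owner(successor_ptrs[i]):
            successor_ptrs[i] |= OWNER_BIT    # pass ownership to successor
        else:
            freed.append(block_number(ptr))   # no remaining user of the block
    return freed


# Snapshot owns blocks 303 and 305; the production file owns block 309
# and still shares block 305 with the snapshot.
snapshot = [303 | OWNER_BIT, 305 | OWNER_BIT]
production = [309 | OWNER_BIT, 305]
freed = delete_snapshot(snapshot, production)
```

In this example, block 303 may be freed because the production file owns its own corresponding block, while ownership of block 305 passes to the production file.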