For convenient reference to stored computer data, the computer data is typically contained in one or more files. Each file has a logical address space for addressing the computer data in the file. In a file server, an operating system program called a file system manager assigns each file a unique numeric identifier called a “file handle,” and also maps the logical address space of the file to a storage address space of at least one data storage device such as a disk drive.
Typically a human user or an application program accesses the computer data in a file by requesting the file system manager to locate the file. After the file system manager returns an acknowledgement that the file has been located, the user or application program sends requests to the file system manager for reading data from or writing data to specified logical addresses of the file.
One of the major responsibilities of the file system manager is to manage and allocate storage space. Normally, a file consists of a collection of extents of storage space. The extents may be uniformly sized pieces, known as file system blocks, or they may vary in size. Larger extents reduce the number of extents that must be tracked for each file; however, larger extents may be counter-productive to file system features such as thin provisioning, block sharing, or block de-duplication. In addition, very large extents can make it costly to create small files or to use the storage space efficiently in the face of file creations and deletions. A file system that normally uses large extents usually has mechanisms that allow files to be built from smaller units when large extents are not available (for example, when the file system has aged and become fragmented).
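As a rough illustration of extent-based mapping, the following Python sketch translates a logical block number of a file into a storage block number. The `Extent` fields, the `lookup` function, and the sample extent list are hypothetical names and values chosen for this example; they are not taken from any particular file system.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    file_offset: int  # first logical block of the file covered by this extent
    disk_start: int   # first storage block backing the extent
    length: int       # number of contiguous blocks in the extent


def lookup(extents, logical_block):
    """Map a logical block to a storage block, or None if it falls in a hole.

    Assumes `extents` is sorted by file_offset and the extents do not overlap.
    """
    starts = [e.file_offset for e in extents]
    i = bisect_right(starts, logical_block) - 1  # last extent starting at or before the block
    if i < 0:
        return None
    e = extents[i]
    if logical_block < e.file_offset + e.length:
        return e.disk_start + (logical_block - e.file_offset)
    return None  # past the end of that extent: unmapped


# A hypothetical file built from three extents of different sizes,
# with a hole covering logical blocks 12 through 15.
file_map = [Extent(0, 1000, 8), Extent(8, 5000, 4), Extent(16, 200, 2)]
```

With variable-size extents, three records describe fourteen blocks here; with fixed-size file system blocks, the same file would need fourteen separate pointers.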
In order to effectively support a variety of file sizes, possibly using various extent sizes, the file mapping is normally accomplished with some form of tree structure. Two commonly used tree structures are the indirect block tree, originally introduced in UNIX, and the B-tree, a form of which many newer file systems, such as Oracle's Solaris ZFS, use to keep track of the extents.
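The indirect block tree can be pictured with a short Python sketch that computes which pointers are followed to reach a given logical block. The constants `NDIRECT` and `PTRS_PER_BLOCK` and the function `block_path` are assumptions made for this example; real file systems choose these values based on their inode layout and block size.

```python
NDIRECT = 12           # assumed number of direct pointers in the inode
PTRS_PER_BLOCK = 1024  # assumed number of block pointers per indirect block


def block_path(logical_block):
    """Return the tree level and the index followed at each level of the tree."""
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    if logical_block < NDIRECT:
        return ("direct", logical_block)

    n = logical_block - NDIRECT
    if n < PTRS_PER_BLOCK:
        return ("single", n)

    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        return ("double", n // PTRS_PER_BLOCK, n % PTRS_PER_BLOCK)

    n -= PTRS_PER_BLOCK ** 2
    if n < PTRS_PER_BLOCK ** 3:
        return (
            "triple",
            n // PTRS_PER_BLOCK ** 2,
            (n // PTRS_PER_BLOCK) % PTRS_PER_BLOCK,
            n % PTRS_PER_BLOCK,
        )
    raise ValueError("logical block exceeds the maximum mappable file size")
```

Small files are reached through the direct pointers alone, while each additional level of indirection multiplies the addressable range by the number of pointers per block.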
A technique known as file versioning maintains read-only versions of a read-write production file by sharing file blocks between the production file and the read-only versions, and performing a copy-on-write to a newly allocated block for the production file when writing to a shared block. Each read-only version is a snapshot of the production file at a respective point in time. Read-only versions can be used for on-line data backup and data mining tasks.
In a copy-on-write file versioning method, the read-only version initially includes only a copy of the inode of the production file. Therefore the read-only version initially shares all of the data blocks as well as any indirect blocks of the production file. When the production file is modified, new blocks are allocated and linked to the production file inode to save the new data, and the original data blocks are retained and linked to the inode of the read-only version. As a result, disk space is saved because only the differences between two consecutive versions are stored. If the production file becomes corrupted during a system crash, then typically the most recent read-only version is copied over to the production file in a recovery operation. In this case, the data that was written to the production file since the creation of the most recent read-only version is lost.
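The copy-on-write behavior described above can be sketched in a few lines of Python. This is a simplified model that assumes a flat list of block pointers per inode (no indirect blocks) and an in-memory dictionary standing in for the storage device; the class names `BlockStore`, `Inode`, and `ProductionFile` are hypothetical.

```python
class BlockStore:
    """In-memory stand-in for a storage device: block id -> block contents."""

    def __init__(self):
        self.blocks = {}
        self._next_id = 0

    def allocate(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id


class Inode:
    """Simplified inode holding only a flat list of data block pointers."""

    def __init__(self, pointers):
        self.pointers = list(pointers)


class ProductionFile:
    def __init__(self, store, contents):
        self.store = store
        self.inode = Inode(store.allocate(d) for d in contents)
        self.shared = set()  # ids of blocks also referenced by some snapshot

    def snapshot(self):
        # The read-only version starts as a copy of the inode only,
        # so it shares every block with the production file.
        self.shared.update(self.inode.pointers)
        return Inode(self.inode.pointers)

    def write(self, index, data):
        block_id = self.inode.pointers[index]
        if block_id in self.shared:
            # Shared block: put the new data in a newly allocated block and
            # leave the original block linked to the snapshot inode.
            self.inode.pointers[index] = self.store.allocate(data)
        else:
            # Block belongs only to the production file: overwrite in place.
            self.store.blocks[block_id] = data


def read_file(store, inode):
    return [store.blocks[b] for b in inode.pointers]
```

After one snapshot and one write to a three-block file, the store holds four blocks rather than six: only the changed block is duplicated.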
One example of a copy-on-write file versioning method is disclosed in Bixby et al., U.S. Pat. No. 7,555,504, issued Jun. 30, 2009, incorporated herein by reference. A protocol is provided for creating read-only and read-write snapshots, deleting snapshots, restoring the production file with a specified snapshot, refreshing a specified snapshot, and naming the snapshots. Block pointers are marked with a flag indicating whether or not the pointed-to block is owned by the parent inode. The pointed-to block can be either a data block or an indirect block. A non-owner marking is inherited by all of the block's descendants. The block ownership controls the copying of indirect blocks when writing to the production file, and also controls de-allocation and passing of blocks when deleting a read-only snapshot. For example, suppose a write to the production file modifies a block pointer in an indirect block that is not owned by the production file. A new indirect block is allocated to the production file, the contents of the original indirect block are copied to the newly allocated indirect block, and the block pointer in the newly allocated indirect block is modified. The original indirect block remains in the snapshot copy that is its owner and in any more recent snapshot copies that may share it with that owner.
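The role of the ownership flag when writing through an unowned indirect block can be illustrated with the following Python sketch. It is a loose model written for this discussion, not the method of the cited patent: it assumes a single indirect block per file, an in-memory block store, and that creating a snapshot passes ownership of the indirect block to the snapshot. The names `Ptr`, `Store`, `make_file`, `take_snapshot`, `write`, and `read_file` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ptr:
    block: int   # id of the pointed-to block (data block or indirect block)
    owned: bool  # True if the pointed-to block is owned by the parent inode


class Store:
    """In-memory stand-in for a storage device: block id -> block contents."""

    def __init__(self):
        self.blocks = {}
        self._next = 0

    def alloc(self, content):
        block_id = self._next
        self._next += 1
        self.blocks[block_id] = content
        return block_id


def make_file(store, data_blocks):
    """Build one indirect block of data pointers; return the inode's pointer to it."""
    leaf_ptrs = [Ptr(store.alloc(d), True) for d in data_blocks]
    return Ptr(store.alloc(leaf_ptrs), True)


def take_snapshot(prod_root):
    """Assumption: the snapshot becomes the owner; the production pointer turns non-owner."""
    snap_root = Ptr(prod_root.block, True)
    prod_root.owned = False
    return snap_root


def write(store, prod_root, index, data):
    if not prod_root.owned:
        # Indirect block is not owned by the production file: allocate a new one
        # and copy the pointers. The non-owner marking is inherited, so every
        # copied pointer is marked as not owned.
        old_ptrs = store.blocks[prod_root.block]
        new_ptrs = [Ptr(p.block, False) for p in old_ptrs]
        prod_root.block = store.alloc(new_ptrs)
        prod_root.owned = True
    ptrs = store.blocks[prod_root.block]
    if ptrs[index].owned:
        store.blocks[ptrs[index].block] = data       # owned: overwrite in place
    else:
        ptrs[index] = Ptr(store.alloc(data), True)   # not owned: copy-on-write


def read_file(store, root):
    return [store.blocks[p.block] for p in store.blocks[root.block]]
```

In this model the first write after a snapshot costs two allocations (one indirect block and one data block), while later writes to the same block go in place because the production file now owns both.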