Network data storage is typically provided by an array of disk drives integrated with a large semiconductor cache memory. A file server is used to interface the cached disk array to the network. The file server maps network files to logical block addresses of storage in the cached disk array and moves data between network clients and the storage in the cached disk array. The file server uses a network block services protocol in a configuration process in order to export logical volumes of the network-attached storage to the network clients, where the logical volumes become local pseudo-disk instances. See, for example, Jiang et al., Patent Application Publication US 2004/0059822 A1 published Mar. 25, 2004, entitled “Network Block Services for Client Access of Network-Attached Storage in an IP Network,” incorporated herein by reference. Network clients typically use a network file system access protocol to access one or more file systems maintained by the file server.
Typically the logical block addresses of storage are subdivided into logical volumes. Each logical volume is mapped to the physical storage using a respective striping and redundancy scheme. The data mover computers typically use the Network File System (NFS) protocol to receive file access commands from clients using the UNIX (Trademark) operating system or the LINUX (Trademark) operating system, and the data mover computers use the Common Internet File System (CIFS) protocol to receive file access commands from clients using the Microsoft (MS) WINDOWS (Trademark) operating system. The NFS protocol is described in “NFS: Network File System Protocol Specification,” Network Working Group, Request for Comments: 1094, Sun Microsystems, Inc., Santa Clara, Calif., March 1989, 27 pages, and in S. Shepler et al., “Network File System (NFS) Version 4 Protocol,” Network Working Group, Request for Comments: 3530, The Internet Society, Reston, Va., April 2003, 262 pages. The CIFS protocol is described in Paul J. Leach and Dilip C. Naik, “A Common Internet File System (CIFS/1.0) Protocol,” Network Working Group, Internet Engineering Task Force, The Internet Society, Reston, Va., Dec. 19, 1997, 121 pages.
The data mover computers may also be programmed to provide clients with network block services in accordance with the Internet Small Computer Systems Interface (iSCSI) protocol, also known as SCSI over IP. The iSCSI protocol is described in J. Satran et al., “Internet Small Computer Systems Interface (iSCSI),” Network Working Group, Request for Comments: 3720, The Internet Society, Reston, Va., April 2004, 240 pages. The data mover computers use a network block services protocol in a configuration process in order to export to the clients logical volumes of network attached storage, which become local pseudo-disk instances. See, for example, Jiang et al., Patent Application Publication US 2004/0059822 A1 published Mar. 25, 2004, entitled “Network Block Services for Client Access of Network-Attached Storage in an IP Network,” incorporated herein by reference.
A storage object such as a virtual disk drive or a raw logical volume can be contained in a file compatible with the UNIX (Trademark) operating system so that the storage object can be exported using the NFS or CIFS protocol and shared among the clients. In this case, the storage object can be replicated and backed up using conventional file replication and backup facilities without disruption of client access to the storage object. See, for example, Liang et al., Patent Application Publication US 2005/0044162 A1 published Feb. 24, 2005, entitled “Multi-Protocol Sharable Virtual Storage Objects,” incorporated herein by reference.
The container file can be a sparse file. As data is written to a sparse file, the size of the file can grow up to a pre-specified maximum number of blocks, and the maximum block size can then be extended by moving the end-of-file (eof). The sharing of file system data blocks conserves data storage for storing files in a file server. The sharing of file system data blocks among versions of a file typically occurs when the file server has a file system based snapshot copy facility that periodically creates snapshot copies of certain production files or production file systems. The sharing of file system data blocks within a file and among unrelated files typically occurs when the file server has a file system based data de-duplication facility that eliminates from the data storage any file system data blocks containing duplicative data content. See, for example, Bixby et al., Patent Application Publication US 2005/0065986 A1 published Mar. 24, 2005, entitled “Maintenance of a File Version Set Including Read-Only and Read-Write Snapshot Copies of a Production File,” incorporated herein by reference.
Snapshot copies are in widespread use for on-line data backup. If a production file becomes corrupted, then the production file is restored with its most recent snapshot copy that has not been corrupted. A file system based snapshot copy facility is described in Bixby et al. U.S. Patent Application Publication 2005/0065986 published Mar. 24, 2005, incorporated herein by reference. When a snapshot copy is initially created, it includes only a copy of the inode of the production file. Therefore the snapshot copy initially shares all of the data blocks as well as any indirect blocks of the production file. When the production file is modified, new blocks are allocated and linked to the production file inode to save the new data, and the original data blocks are retained and linked to the inode of the snapshot copy. The result is that disk space is saved by only saving the difference between two consecutive versions. Block pointers are marked with a flag indicating whether or not the pointed-to block is owned by the parent inode. A non-owner marking is inherited by all of the block's descendants. The block ownership controls the copying of indirect blocks when writing to the production file, and also controls deallocation and passing of blocks when deleting a snapshot copy.
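By way of illustration only, the copy-on-write behavior described above can be sketched in Python as follows. The sketch assumes a flat inode with no indirect blocks, and the names Inode, Volume, create_snapshot, write and read are hypothetical and are not taken from the cited publication. It also assumes that ownership of a shared block stays with the snapshot copy, so that the production file allocates a new block on its next write to that block.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    # file block index -> (physical block number, owned flag)
    pointers: dict = field(default_factory=dict)


class Volume:
    """Hypothetical block store: physical block number -> data."""

    def __init__(self):
        self.blocks = {}
        self.next_free = 0

    def allocate(self, data):
        pbn = self.next_free
        self.next_free += 1
        self.blocks[pbn] = data
        return pbn


def create_snapshot(production):
    # The snapshot starts as a copy of the production inode only, so it
    # initially shares every data block with the production file.
    snapshot = Inode(dict(production.pointers))
    # Assumption: ownership stays with the snapshot's pointers, and the
    # production file's pointers become non-owner until rewritten.
    production.pointers = {
        index: (pbn, False) for index, (pbn, _) in production.pointers.items()
    }
    return snapshot


def write(volume, inode, index, data):
    entry = inode.pointers.get(index)
    if entry is not None and entry[1]:
        # Block is owned by this inode: overwrite it in place.
        volume.blocks[entry[0]] = data
    else:
        # Block is new or not owned: allocate a new block and link it,
        # leaving the original block linked to the snapshot inode.
        inode.pointers[index] = (volume.allocate(data), True)


def read(volume, inode, index):
    return volume.blocks[inode.pointers[index][0]]
```

In this sketch, a write to the production file after a snapshot consumes one new block for the modified data only, while unmodified blocks remain shared between the two inodes.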
A file system based data de-duplication facility permits a shared file system data block to be linked to more than one inode or indirect block. For example, data de-duplication is applied to a file when the file is migrated into the file server or when new data is written to the file. The new data is written to newly allocated file system data blocks marked as blocks that have not been de-duplicated, and an attribute of the file is set to indicate that a de-duplication process is in progress. Then the data de-duplication process searches a single-instance data store of de-duplicated blocks for a copy of the data in each data block marked as not yet de-duplicated. If a copy is found, then, in the inode or indirect block of the file, a pointer to the block marked as not yet de-duplicated is replaced with a pointer to the copy in the single instance data store, and a reference counter for the data block in the single-instance data store is incremented. If a copy is not found, then the block of new data is marked as de-duplicated and added to the single instance data store. Once the data de-duplication process has been applied to all of the data blocks of the file, then the attribute of the file is set to indicate that the de-duplication process is finished. Whenever a file is deleted, the reference counter for each data block of the file is decremented. Whenever a reference counter is decremented to zero, the storage of the corresponding data block is de-allocated by putting the data block on a free block list so that the storage of the data block becomes available for allocation for receiving new data.
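The de-duplication steps above can be sketched, by way of example only, as the following Python model. The class and method names are hypothetical. The use of a SHA-256 digest to search the single-instance data store, the byte-for-byte comparison after a digest match, and the freeing of the redundant newly written block once its pointer has been replaced are assumptions of this sketch and are not stated in the description above.

```python
import hashlib


class File:
    def __init__(self):
        self.pointers = []             # block numbers, in file order
        self.not_deduped = set()       # indices of blocks not yet de-duplicated
        self.dedup_in_progress = False


class DedupStore:
    def __init__(self):
        self.blocks = {}      # block number -> data
        self.refcount = {}    # block number -> reference counter
        self.index = {}       # digest -> block number in single-instance store
        self.free_list = []
        self.next_block = 0

    def _allocate(self, data):
        bn = self.free_list.pop() if self.free_list else self.next_block
        if bn == self.next_block:
            self.next_block += 1
        self.blocks[bn] = data
        self.refcount[bn] = 1
        return bn

    def _release(self, bn):
        # Decrement the reference counter; free the block when it hits zero.
        self.refcount[bn] -= 1
        if self.refcount[bn] == 0:
            digest = hashlib.sha256(self.blocks[bn]).digest()
            if self.index.get(digest) == bn:
                del self.index[digest]
            del self.blocks[bn], self.refcount[bn]
            self.free_list.append(bn)

    def write_file(self, chunks):
        # New data goes to newly allocated blocks marked as not de-duplicated.
        f = File()
        f.pointers = [self._allocate(c) for c in chunks]
        f.not_deduped = set(range(len(chunks)))
        f.dedup_in_progress = True
        return f

    def deduplicate(self, f):
        for i in sorted(f.not_deduped):
            bn = f.pointers[i]
            data = self.blocks[bn]
            digest = hashlib.sha256(data).digest()
            existing = self.index.get(digest)
            if existing is not None and self.blocks[existing] == data:
                # Copy found: repoint to it and bump its reference counter.
                f.pointers[i] = existing
                self.refcount[existing] += 1
                self._release(bn)  # assumption: redundant block is freed
            else:
                # No copy: this block joins the single-instance data store.
                self.index[digest] = bn
        f.not_deduped.clear()
        f.dedup_in_progress = False

    def delete_file(self, f):
        for bn in f.pointers:
            self._release(bn)
        f.pointers = []
```

A block number returned to the free list in this sketch is reused by the next allocation, which corresponds to the storage of the data block becoming available for receiving new data.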
Block ownership information for a snapshot copy facility is maintained by storing respective reference counts for the file system indirect blocks and file system data blocks in the file system block hierarchy, and by storing respective delegated reference counts for the parent-child block relationships in the file system block hierarchy. For each parent-child block relationship, a comparison of the respective delegated reference count for the parent-child relationship to the reference count for the child block indicates whether or not the child block is either shared among parent blocks or has a single, exclusive parent block. For example, if the respective delegated reference count is equal to the respective reference count, then the child block is not shared, and the parent block is the exclusive parent of the child block. Otherwise, if the respective delegated reference count is not equal to the respective reference count, then the child block is shared among parent blocks. As will be further described below, this method of using delegated reference counts for indicating whether a block is either exclusively owned or shared has the advantage of indicating block ownership in a way that is compatible between the snapshot copy facility and the use of reference counts by the data de-duplication facility, and that avoids the updating of reference counts in the metadata of child blocks when a shared indirect block is duplicated or “split” in order to perform a write to a data block depending from the shared indirect block in the file system block hierarchy.
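The comparison described above can be illustrated with a minimal Python sketch. The names Block, Pointer, split and release are hypothetical, and the initial count of 1000 and the delegated amount of 10 are arbitrary values chosen for the example. The sketch assumes one possible way of creating a second parent, namely dividing the existing delegated reference count between the two parents, which leaves the reference count stored with the child block untouched.

```python
from dataclasses import dataclass


@dataclass
class Block:
    ref_count: int      # reference count stored with the child block


@dataclass
class Pointer:
    child: Block
    delegated: int      # delegated reference count for this parent-child link


def is_exclusively_owned(ptr):
    # Equal counts: the parent holding this pointer is the exclusive parent.
    # Unequal counts: the child block is shared among parent blocks.
    return ptr.delegated == ptr.child.ref_count


def split(ptr, amount):
    # Assumption: sharing is created by dividing the delegated count between
    # the existing parent and a new parent. The child's metadata is not written.
    assert 0 < amount < ptr.delegated
    ptr.delegated -= amount
    return Pointer(ptr.child, amount)


def release(ptr):
    # A parent dropping its pointer returns its delegated count to the child.
    ptr.child.ref_count -= ptr.delegated
    ptr.delegated = 0
    return ptr.child.ref_count == 0     # True when the child can be freed


child = Block(ref_count=1000)
production = Pointer(child, 1000)
snapshot = split(production, 10)        # child.ref_count stays at 1000
```

After the split in this example, neither pointer's delegated count equals the child's reference count, so both parents see the child as shared. If the snapshot pointer is later released, the child's reference count drops to 990, which again equals the production pointer's delegated count and marks the child as exclusively owned.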
A file system based data de-duplication facility is used in conjunction with a snapshot copy facility to scale in the context of a large number of snapshot copies. Using the data de-duplication facility or the snapshot copy facility according to the storage technology described above results in the sharing of data blocks by multiple files, among a set of version files, or within a single file. Sharing of the data blocks, maintained through the delegated reference count scheme of version files, greatly reduces the amount of physical storage required to store the file system data. When the file system data blocks are relocated within a section of the file system address space (perhaps for replacement of the underlying storage infrastructure), pointers to blocks belonging to version files are updated during this operation to point to newly allocated replacement blocks. Specifically, the relocation operation must lock access to blocks to prevent their contents from being changed during relocation, and must create and hold additional references on both the blocks being relocated and their replacement blocks to prevent either from being prematurely freed or incorrectly considered non-shared. When multiple file system blocks share and point to the same data block, the metadata of all of the shared data blocks needs to be updated during the file system data block relocation operation. Updating metadata for all shared file system data blocks is an I/O-intensive operation because it requires reading the metadata of each data block from storage and then writing the updated metadata back.
Read or write access to files and their snapshot copies in the manner described above is considerably slower, especially while the data blocks are being relocated. Additionally, the technology described above cannot accommodate the compression of shared data blocks.
The storage technology described above, in combination with a continuing increase in disk drive storage density, file server processing power and network bandwidth at decreasing cost, has provided network clients with more than an adequate supply of network storage capacity at affordable prices. Increasing performance by avoiding the I/O involved in updating the metadata of every shared file system data block, reducing the time it takes to read data from a file or write data to a file, reducing the time it takes to relocate file system data blocks, and allowing advanced operations such as compression and encryption of shared file system data blocks would be an advancement in the data storage computer-related arts. This is becoming increasingly important as the amount of information being handled and stored grows geometrically over short time periods and such environments add more file systems and data at a rapid pace.