A storage system is a computer that provides storage service relating to the organization of information on storage devices, such as disks. The storage system may be deployed within a network attached storage (NAS) environment and, as such, may be embodied as a file server. The file server, or filer, includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
A filer may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access files stored on a server, e.g., the filer. In this model, the client may comprise an application, such as a database application, executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the file system on the filer by issuing file system protocol messages to the filer over the network.
A common type of file system is a “write-in-place” file system, an example of which is the conventional Berkeley fast file system. In a write-in-place file system, the locations of the data structures, such as inodes and data blocks, on disk are typically fixed. An inode is a data structure used to store information, such as meta-data, about a file, whereas the data blocks are structures used to store the actual data for the file. The information contained in an inode may include, e.g., ownership of the file, access permission for the file, size of the file, file type and references to locations on disk of the data blocks for the file. The references to the locations of the file data are provided by pointers, which may further reference indirect blocks that, in turn, reference the data blocks, depending upon the quantity of data in the file. Changes to the inodes and data blocks are made “in-place” in accordance with the write-in-place file system. If an update to a file extends the quantity of data for the file, an additional data block is allocated and the appropriate inode is updated to reference that data block.
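The inode layout described above can be sketched as follows. This is a minimal illustration rather than the on-disk format of any particular file system: the block size, the number of direct pointers, the capacity of the single indirect block, and all class and function names are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from itertools import count

BLOCK_SIZE = 4096          # assumed block size in bytes
NUM_DIRECT = 12            # assumed number of direct pointers in an inode
PTRS_PER_INDIRECT = 1024   # assumed pointers held by one indirect block


@dataclass
class Inode:
    """Hypothetical inode: file metadata plus references to data blocks."""
    owner: str
    permissions: int
    file_type: str
    size: int = 0
    direct: list = field(default_factory=list)    # block numbers held in the inode
    indirect: list = field(default_factory=list)  # block numbers held in one indirect block


def blocks_needed(size):
    """Number of data blocks required to hold `size` bytes."""
    return (size + BLOCK_SIZE - 1) // BLOCK_SIZE


def locate(inode, offset):
    """Return the disk block number that holds byte `offset` of the file."""
    if not 0 <= offset < inode.size:
        raise ValueError("offset outside file")
    index = offset // BLOCK_SIZE
    if index < NUM_DIRECT:
        return inode.direct[index]
    return inode.indirect[index - NUM_DIRECT]


def extend(inode, new_size, allocate):
    """Grow the file: allocate additional blocks and update the inode in place."""
    while len(inode.direct) + len(inode.indirect) < blocks_needed(new_size):
        block = allocate()
        if len(inode.direct) < NUM_DIRECT:
            inode.direct.append(block)
        elif len(inode.indirect) < PTRS_PER_INDIRECT:
            inode.indirect.append(block)
        else:
            raise OverflowError("file too large for this sketch")
    inode.size = new_size


# Usage: a 13-block file needs 12 direct pointers and 1 indirect pointer.
free_blocks = count(100)  # hypothetical allocator handing out block numbers
inode = Inode(owner="root", permissions=0o644, file_type="regular")
extend(inode, 13 * BLOCK_SIZE, lambda: next(free_blocks))
first_block = locate(inode, 0)
last_block = locate(inode, 12 * BLOCK_SIZE)
```

Small files are resolved entirely through the direct pointers; only when the file outgrows them does a lookup pass through the indirect block, which is the dependence on the quantity of data noted above.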
Another type of file system is a write-anywhere file system that does not overwrite data on disks. If a data block on disk is retrieved (read) from disk into memory and “dirtied” with new data, the data is stored (written) to a new location on disk to thereby optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. A particular example of a write-anywhere file system that is configured to operate on a filer is the SpinFS file system available from Network Appliance, Inc. of Sunnyvale, Calif. The SpinFS file system utilizes a write-anywhere technique for user and directory data but writes metadata in place. The SpinFS file system is implemented within a storage operating system having a protocol stack and associated disk storage.
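The write-anywhere behavior for data blocks can be illustrated with a short sketch. It is not the SpinFS implementation: the class, its in-memory dictionaries, and the simple "next free block" allocator are assumptions chosen only to show that a dirtied block is written to a new location while the old block is left intact.

```python
class WriteAnywhereStore:
    """Hypothetical block store that never overwrites a data block."""

    def __init__(self):
        self.disk = {}        # physical block number -> contents
        self.block_map = {}   # logical block number -> physical block number
        self.next_free = 0    # assumed allocator: next unused physical block

    def write(self, logical, data):
        """Store `data` at a new physical location; return the old location, if any."""
        physical = self.next_free
        self.next_free += 1
        self.disk[physical] = data
        previous = self.block_map.get(logical)
        self.block_map[logical] = physical   # only the reference changes
        return previous

    def read(self, logical):
        """Return the current contents of a logical block."""
        return self.disk[self.block_map[logical]]


# Usage: rewriting logical block 0 leaves the earlier version on disk.
store = WriteAnywhereStore()
store.write(0, b"version 1")
old_location = store.write(0, b"version 2")
current = store.read(0)
preserved = store.disk[old_location]
```

Because the earlier version of the block survives the update, a structure that still references the old location continues to see the old contents, which is what makes point-in-time images inexpensive in such file systems.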
Disk storage is typically implemented as one or more storage “volumes” that comprise physical storage disks, defining an overall logical arrangement of storage space. Currently available filer implementations can serve a large number of discrete volumes (150 or more, for example). Each volume is associated with its own file system and, for purposes hereof, volume and file system shall generally be used synonymously. The disks within a volume may be organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate caching of parity information with respect to the striped data. As described herein, a volume typically comprises at least one data disk and one associated parity disk (or possibly data/parity partitions in a single disk) arranged according to a RAID 4, or equivalent high-reliability, implementation. In other examples, disk storage may be organized in non-RAID configurations including, for example, just a bunch of disks (JBOD). As such, the description of RAID should be taken as exemplary only.
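The role of the parity information in a RAID 4 arrangement can be shown with a small worked example. RAID 4 computes parity as the bytewise exclusive-OR (XOR) of the data blocks in a stripe and stores it on a dedicated parity disk; the stripe contents and function name below are hypothetical values chosen for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks (two bytes per block for brevity).
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# The block written to the dedicated parity disk.
parity = xor_blocks(stripe)

# If the second data disk fails, its block is rebuilt from the survivors and the parity.
recovered = xor_blocks([stripe[0], stripe[2], parity])
```

XOR is its own inverse, so combining the surviving data blocks with the parity block yields exactly the missing block. This is why a single disk failure in the group can be tolerated.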
A common feature of write-anywhere file systems is the ability to create a point-in-time image of a data container, such as a file system or some subset thereof. One example of the creation of point-in-time persistent images is described in U.S. Pat. No. 5,819,292, entitled METHOD FOR MAINTAINING CONSISTENT STATES OF A FILE SYSTEM AND FOR CREATING USER-ACCESSIBLE READ-ONLY COPIES OF A FILE SYSTEM, by David Hitz et al., the contents of which are hereby incorporated by reference. Another example of the creation of point-in-time persistent images of a file system (a “clone”) is a conventional cloning process utilized in file systems, such as the exemplary SpinFS file system available from Network Appliance, Inc. In the SpinFS file system, disk storage is organized into storage pools, which are further divided into virtual file systems (VFS). Each VFS contains, at its top level, a VFS inode that includes pointers to additional data blocks containing inodes and to indirect blocks that, in turn, reference additional data blocks containing inodes. These inodes are, in turn, the top-level data structures of individual files and/or directories within the VFS.
The conventional cloning process for use with the SpinFS file system is described in U.S. Pat. No. 6,868,417, issued on Mar. 15, 2005, entitled MECHANISM FOR HANDLING FILE LEVEL AND BLOCK LEVEL REMOTE FILE ACCESSES USING THE SAME SERVER, by Michael L. Kazar et al., the contents of which are hereby incorporated by reference. Here, when a VFS is cloned, all inodes in the VFS are copied to create the clone, including all indirect blocks pointing to (referencing) inodes. These inode blocks referenced by the VFS inode comprise an inode file describing the VFS. The inode file comprises a plurality of inodes, each of which represents a file or directory. A VFS may contain a very large number (e.g., millions or billions) of individual files. Accordingly, the time required to copy each inode of a file (and/or directory) during the cloning process may be on the order of tens of seconds, during which time the file data is inaccessible to clients. Loss of data access for such a relatively long period of time is undesirable, especially in systems wherein a clone is created on a regular basis, e.g., every hour.
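The cost of the conventional cloning process can be made concrete with a brief sketch. It is a simplified model under stated assumptions, not the patented mechanism: the `Vfs` and `Inode` classes, the dictionary standing in for the inode file, and the choice to copy block references rather than file data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical inode: one per file or directory in the VFS."""
    number: int
    data_blocks: list = field(default_factory=list)


@dataclass
class Vfs:
    """Hypothetical VFS whose inode file maps inode numbers to inodes."""
    name: str
    inode_file: dict = field(default_factory=dict)


def clone_vfs(source, clone_name):
    """Copy every inode of `source`; return the clone and the number of inodes copied."""
    clone = Vfs(clone_name)
    copied = 0
    for number, inode in source.inode_file.items():
        # Assumption: only the block references are duplicated, not the file data.
        clone.inode_file[number] = Inode(number, list(inode.data_blocks))
        copied += 1
    return clone, copied


# Usage: the work performed grows linearly with the number of files in the VFS.
vfs = Vfs("vfs0", {n: Inode(n, [n * 10]) for n in range(1000)})
clone, inodes_copied = clone_vfs(vfs, "vfs0-clone")

# Later changes to the original do not affect the point-in-time image.
vfs.inode_file[0].data_blocks.append(9999)
clone_blocks = clone.inode_file[0].data_blocks
```

The loop visits every inode, so a VFS holding millions or billions of files requires that many copy operations before the clone is complete, which is the source of the delay described above.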