Mass data storage systems are used for many purposes, including storing user and system data for data processing, backup and transmission applications. A typical mass storage system includes numerous computer disk drives that cooperatively store data, for example, as a single logically contiguous storage space, often referred to as a volume or a logical unit. One or more such volumes/logical units may be configured in a storage system. The storage system therefore performs much like a single computer disk drive when viewed by a host computer system. For example, the host computer system can access data of the storage system much like it would access data of a single internal disk drive, in essence, without regard to the substantially transparent underlying control of the storage system.
A mass storage system may include one or more storage modules with each individual storage module comprising multiple disk drives coupled to one or more storage controllers. In one typical configuration, a storage module may be coupled through its storage controller(s) directly to a host system as a standalone storage module. Typical storage controllers include significant cache memory capacity to improve I/O performance. Write requests may be completed when the supplied data is written to the higher-speed cache memory. At some later point, the data in cache memory may be flushed or posted to the persistent storage of the storage modules. In a standalone configuration, it is common to enhance reliability and performance by providing a redundant pair of storage controllers. The redundant pair of controllers enhances reliability in that an inactive storage controller may assume control when an active controller is sensed to have failed in some manner.
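By way of non-limiting illustration, the write-caching behavior described above may be sketched as follows. The class name `WriteBackCache`, its methods, and the use of a dictionary as the persistent store are hypothetical and chosen only for this example; the sketch shows a write being acknowledged once the data is in cache and reaching persistent storage only when the cache is later flushed.

```python
class WriteBackCache:
    """Toy write-back cache in front of a slower persistent store (illustrative only)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # dict: block number -> bytes (stand-in for disks)
        self.dirty = {}                     # cached blocks not yet written to the store

    def write(self, block_no, data):
        # The write is treated as complete once the data is in cache memory.
        self.dirty[block_no] = data
        return "acknowledged"

    def read(self, block_no):
        # Serve the newest copy: cache first, then persistent storage.
        if block_no in self.dirty:
            return self.dirty[block_no]
        return self.backing_store.get(block_no)

    def flush(self):
        # Post every dirty block to persistent storage, then clear the cache.
        flushed = len(self.dirty)
        self.backing_store.update(self.dirty)
        self.dirty.clear()
        return flushed


disk = {}                      # hypothetical persistent storage
cache = WriteBackCache(disk)
cache.write(7, b"hello")       # acknowledged before the data reaches the disk
```

In this sketch, block 7 is readable through the cache immediately after the write, but it appears in `disk` only after `cache.flush()` is called. A production controller would additionally protect the dirty data against power loss or controller failure, which the sketch does not attempt.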
In another standard system configuration, a storage module may be part of a larger storage network or “cluster.” For a cluster-type architecture, multiple storage modules and corresponding storage controllers are typically coupled through a switched network communication medium, known as a “fabric,” to one or more host systems. This form of storage module system is often referred to as a Storage Area Network (SAN) architecture and the switching fabric is, concomitantly, referred to as a SAN switching fabric. In such a clustered configuration, it is common that all of the storage controllers exchange coherency information and other information for load balancing of I/O request processing and other control information. Such control information may be exchanged over the same network fabric that couples the storage controllers to the host systems (e.g., a “front-end” connection) or over another fabric that couples the storage controllers to the storage modules (e.g., a “back-end” connection).
A network storage appliance (e.g., a storage server) is typically a discrete special-purpose computer that provides file services relating to the organization of information on the storage devices of a mass data storage system. The network storage appliance, or “filer,” includes integrated software (firmware) and an operating system that implements a file system to logically organize information, for example, as a hierarchical structure of directories and files on the storage devices (e.g., storage disks). Each “on-disk” file may be implemented as a set of disk blocks configured to store information, such as text; the directory, by comparison, may be implemented as a specially formatted file in which information about other files and directories is stored.
On-disk format representation of some file systems is block-based using, for example, four kilobyte (KB) blocks (“4K block”) and index nodes to describe the files. Index nodes, which are informally referred to as “inodes,” are data structures used to store information, such as metadata, about a file. Information contained in a typical inode may include, e.g., file ownership data, file access permission data, file size, file type, and on-disk location of the data for the file. The file system uses an identifier with an inode number, known as a “file handle,” to retrieve an inode from a disk. The file system also uses metadata files to store metadata describing the layout of its file system. An example of on-disk format structure of one standard file system is described in U.S. Pat. No. 5,819,292 to David Hitz et al., which is incorporated herein by reference in its entirety and for all purposes.
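For purposes of illustration only, the kind of information held in an inode, and its retrieval by inode number, may be sketched as follows. The field names, the in-memory `inode_table`, and the dictionary form of the file handle are assumptions made for this example and do not reflect the on-disk format of any particular file system.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4096  # 4 KB blocks, as in the example above


@dataclass
class Inode:
    """Hypothetical in-memory view of an inode's metadata."""
    owner: str
    permissions: int          # e.g., 0o644
    size: int                 # file size in bytes
    file_type: str            # e.g., "regular" or "directory"
    block_numbers: list = field(default_factory=list)  # on-disk location of the data


def blocks_needed(size_bytes, block_size=BLOCK_SIZE):
    # Number of whole blocks required to hold size_bytes (ceiling division).
    return (size_bytes + block_size - 1) // block_size


# Stand-in for the on-disk inode file: inode number -> Inode.
inode_table = {}


def lookup(file_handle):
    # The file handle is assumed here to be a dict carrying an inode number.
    return inode_table[file_handle["inode_number"]]


inode_table[42] = Inode("alice", 0o644, 10000, "regular", [100, 101, 102])
handle = {"inode_number": 42}
```

Under these assumptions, `lookup(handle)` returns the inode for a 10,000-byte file, and `blocks_needed(10000)` evaluates to 3, matching the three block numbers recorded in the inode.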
A “snapshot” of a file system captures the contents of the files and directories in the file system at a particular point in time. A conventional snapshot does not use disk space when it is initially created, is typically a virtual read-only file, and is designed so that many different snapshots can be created for the same file system. Unlike some file systems that create a clone of the file system by duplicating the entire inode file and all of the indirect blocks, conventional snapshots duplicate only the inode that describes the inode file. Such snapshots allow users of the file system to recover earlier versions of a file, for example, following an unintended deletion or modification of the file. In addition, the contents of a snapshot can be copied to another storage device or medium to provide a backup copy of the file system. A snapshot can also be copied to another file server and used as a replica. Some file systems include a copy-on-write snapshot mechanism. Snapshot block ownership in such systems is generally recorded by updating the block's entry in a blockmap file, which is a bitmap indicating which blocks are in-use and which are free for use. Additional information regarding snapshot files can be found, for example, in U.S. Pat. No. 6,289,356 B1, which is incorporated herein by reference in its entirety and for all purposes.
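As a simplified, non-limiting sketch of the copy-on-write behavior and blockmap bookkeeping described above, consider the following toy model. The class `ToyFileSystem`, the use of one bitmask per block (bit 0 for the active file system, bit n for snapshot n), and the flat root mapping are all assumptions made for illustration; an actual file system maintains a tree of inodes and indirect blocks rather than a single mapping.

```python
ACTIVE = 0  # bit 0 of a blockmap entry marks use by the active file system


class ToyFileSystem:
    """Illustrative copy-on-write file system with a per-block bitmask blockmap."""

    def __init__(self, num_blocks):
        self.data = [None] * num_blocks    # physical blocks
        self.blockmap = [0] * num_blocks   # one bitmask per block; 0 means free
        self.root = {}                     # active root: file block index -> physical block
        self.snapshots = {}                # snapshot id (1, 2, ...) -> saved copy of the root

    def _allocate(self):
        # A block is free only if neither the active file system nor any snapshot uses it.
        for block_no, bits in enumerate(self.blockmap):
            if bits == 0:
                return block_no
        raise RuntimeError("no free blocks")

    def write(self, index, payload):
        # Copy-on-write: never overwrite in place; new data goes to a free block.
        new_block = self._allocate()
        self.data[new_block] = payload
        self.blockmap[new_block] |= 1 << ACTIVE
        old_block = self.root.get(index)
        if old_block is not None:
            # The active file system releases the old block; it stays allocated
            # for as long as any snapshot bit remains set.
            self.blockmap[old_block] &= ~(1 << ACTIVE)
        self.root[index] = new_block

    def snapshot(self):
        # Only the root is duplicated; no data blocks are copied.
        snap_id = len(self.snapshots) + 1
        self.snapshots[snap_id] = dict(self.root)
        for block_no in self.root.values():
            self.blockmap[block_no] |= 1 << snap_id
        return snap_id

    def read(self, index, snap_id=None):
        root = self.root if snap_id is None else self.snapshots[snap_id]
        return self.data[root[index]]


fs = ToyFileSystem(8)
fs.write(0, b"v1")
snap = fs.snapshot()
fs.write(0, b"v2")   # the snapshot keeps pointing at the block holding b"v1"
```

In this sketch, reading index 0 from the active file system yields the new contents, while reading it through `snap` yields the contents as of the snapshot. The block holding the earlier version cannot be reallocated while its snapshot bit is set in the blockmap.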
One problem with creating a conventional snapshot is the requirement for additional file system metadata and, thus, additional storage space in the active file system to keep track of which blocks the snapshot occupies. This is inefficient both in its use of storage space and in the time needed to create the snapshots. Another problem with conventional snapshots is that, once data has been captured in a snapshot, that data becomes “trapped” and, thus, modifications to trapped data are not allowed. This restriction also applies to any metadata holding that information in the flexible volume (flexvol). Both the virtual volume block numbers (VVBNs) and the physical volume block numbers (PVBNs) become trapped, which prevents a reduction in the footprint of the snapshot data. Consequently, it is no longer possible to modify the block numbers that were previously assigned to any files trapped in a snapshot in the flexvol. In addition, conventional file systems do not allow space to be shared among unique data blocks or data to be shared across multiple flexible volumes. There is therefore a need for improved techniques for more quickly and efficiently capturing the contents of the files and directories in a file system at a particular point in time.
The present disclosure is susceptible to various modifications and alternative forms, and some representative embodiments have been shown by way of example in the drawings and will be described in detail herein. It should be understood, however, that the inventive aspects are not limited to the particular forms illustrated in the drawings. Rather, the disclosure is to cover all modifications, equivalents, combinations and subcombinations, and alternatives falling within the spirit and scope of the disclosure as defined by the appended claims.