Various forms of network-based storage systems are known today. These forms include network attached storage (NAS), storage area networks (SANs), and others. Network storage systems are commonly used for a variety of purposes, such as providing multiple users with access to shared data, backing up critical data (e.g., by data mirroring), etc.
A network-based storage system typically includes at least one storage server, which is a processing system configured to store and retrieve data on behalf of one or more client processing systems (“clients”). In the context of NAS, a storage server may be a file server, which is sometimes called a “filer”. A filer operates on behalf of one or more clients to store and manage shared files. The files may be stored in a storage subsystem that includes one or more arrays of mass storage devices, such as magnetic or optical disks or tapes, by using RAID (Redundant Array of Inexpensive Disks). Hence, the mass storage devices in each array may be organized into one or more separate RAID groups.
In a SAN context, a storage server provides clients with block-level access to stored data, rather than file-level access. Some storage servers are capable of providing clients with both file-level access and block-level access, such as Filers made by Network Appliance, Inc. (NetApp®) of Sunnyvale, Calif.
In file servers, data is stored in logical containers called volumes, which may be identical with, or subsets of, aggregates. An “aggregate” is a logical container for a pool of storage, combining one or more physical mass storage devices (e.g., disks) or parts thereof into a single logical storage object, which contains or provides storage for one or more other logical data sets at a higher level of abstraction (e.g., volumes). A “volume” is a set of stored data associated with a collection of mass storage devices, such as disks, which obtains its storage from (i.e., is contained within, and may be coextensive with) an aggregate, and which is managed as an independent administrative unit, such as a complete file system. A “file system” is an independently managed, self-contained, hierarchical set of data units (e.g., files, blocks or Logical Unit Numbers). Although a volume or file system (as those terms are used herein) may store data in the form of files, that is not necessarily the case. That is, a volume or file system may store data in the form of other units, such as blocks or Logical Unit Numbers (LUNs).
One useful feature of a storage server is the ability to create a read-only, persistent, point-in-time image (RPPI) of a data set, such as a volume or a LUN, including its metadata. This capability allows the exact state of the data set to be restored from the RPPI in the event of, for example, data corruption or accidental data deletion. The ability to restore data from an RPPI provides administrators with a simple mechanism to revert the state of their data to a known previous point in time as captured by the RPPI. Typically, creation of an RPPI or restoration from an RPPI can be controlled from a client-side software tool. An example of an implementation of an RPPI is a Snapshot™ generated by SnapDrive™. SnapDrive is made by NetApp. Unlike other RPPI implementations, NetApp Snapshots do not require duplication of data blocks in the active file system, because a Snapshot can include pointers to data blocks in the active file system, for any blocks that have not been modified since the Snapshot was created. The “active” file system is the current working file system, where data may be modified or deleted, as opposed to an RPPI, which is a read-only copy of the file system saved at a specific time.
An example of an RPPI technique which does not require duplication of data blocks to create an RPPI is described in U.S. Pat. No. 5,819,292, which is incorporated herein by reference, and which is assigned to NetApp. The described technique of creating an RPPI (e.g., a Snapshot) does not require duplication of data blocks in the active file system, because the Snapshot can include pointers to data blocks in the active file system, for any blocks that have not been modified since the RPPI was created. (The term “Snapshot” is used in this document without derogation of Network Appliance, Inc.'s trademark rights.) Among other advantages, this technique allows an RPPI to be created quickly, helps to reduce consumption of storage space by RPPIs, and reduces the need to repeatedly update data block pointers as required in some prior art RPPI techniques.
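By way of illustration only, the pointer-sharing idea described above can be sketched with a toy in-memory model. This is a simplified sketch and not the technique of U.S. Pat. No. 5,819,292; the class name ToyFileSystem, its methods, and the use of integer block identifiers are all assumptions introduced for this example. Creating an image copies only the map of block pointers, and a later write allocates a new block instead of overwriting one that an image still references.

```python
class ToyFileSystem:
    """Hypothetical toy model of an active file system with pointer-sharing images."""

    def __init__(self):
        self.blocks = {}      # physical block id -> data (the stored blocks)
        self.active = {}      # logical block number -> physical block id
        self.snapshots = {}   # image name -> saved copy of the pointer map
        self._next_id = 0

    def write(self, logical_no, data):
        # Never overwrite in place: allocate a new physical block so that any
        # image still pointing at the old block keeps seeing the old data.
        pid = self._next_id
        self._next_id += 1
        self.blocks[pid] = data
        self.active[logical_no] = pid

    def create_snapshot(self, name):
        # Only the pointer map is copied; no data block is duplicated.
        self.snapshots[name] = dict(self.active)

    def read(self, logical_no, snapshot=None):
        block_map = self.active if snapshot is None else self.snapshots[snapshot]
        return self.blocks[block_map[logical_no]]

    def shared_blocks(self, name):
        # Logical blocks whose pointer is still identical in the image and the
        # active file system, i.e. blocks not modified since the image was made.
        snap = self.snapshots[name]
        return sorted(n for n, pid in snap.items() if self.active.get(n) == pid)
```

In this sketch, taking an image of a two-block file system leaves the number of stored blocks unchanged. A third block is allocated only when one of the two logical blocks is subsequently rewritten, and the image continues to read the original data.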
In some instances, it may be desirable to write data to an RPPI. For example, when an RPPI of a dataset (e.g., an active file system, a LUN, etc.) is mounted as a Windows drive for verification purposes, Windows must write file-system-specific metadata to the RPPI. One way to achieve this is to use a technique described in U.S. patent application Ser. No. 10/412,478 entitled “Writable Read Only Snapshots”, by Vijayan Rajan and filed on Apr. 11, 2003. A writeable, read-only Snapshot comprises a read-only Snapshot and a writeable virtual disk file (hereinafter “vdisk”) residing in the active file system. The vdisk is a “shadow” image of the Snapshot and, as such, includes an attribute that specifies the Snapshot to be used as the base. A write operation directed to the writeable, read-only Snapshot is “trapped”, such that the data associated with the operation is stored in the vdisk in the active file system.
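The trapping behavior described above can be illustrated with a minimal sketch. The sketch assumes a block-level overlay and is not the implementation of the referenced application; the class name WritableSnapshotView and the dictionary layout of the vdisk are hypothetical. Writes are stored in the vdisk, reads prefer the vdisk and otherwise fall through to the base image, and the base image itself is never modified.

```python
from types import MappingProxyType


class WritableSnapshotView:
    """Hypothetical overlay: a read-only base image plus a writeable vdisk."""

    def __init__(self, base_blocks, base_name):
        # The base is exposed through a read-only mapping, so it cannot be
        # modified through this view.
        self._base = MappingProxyType(dict(base_blocks))
        # The vdisk carries an attribute naming the image it shadows,
        # plus the blocks written so far.
        self.vdisk = {"base": base_name, "blocks": {}}

    def write(self, block_no, data):
        # "Trap" the write: store the data in the vdisk, not in the base.
        self.vdisk["blocks"][block_no] = data

    def read(self, block_no):
        # Blocks present in the vdisk shadow the base; all other reads
        # fall through to the base image.
        written = self.vdisk["blocks"]
        if block_no in written:
            return written[block_no]
        return self._base[block_no]
```

In this sketch, writing metadata to block 0 through the view changes what the view returns for block 0, while the base image continues to hold its original contents for every block.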
The writeable, read-only Snapshot technique, however, creates at least one problem for storage management tasks. Because the vdisk is created and stored in the active file system, any later-created Snapshot of the active file system will reference the vdisk, since the vdisk is part of the active file system. As a result, the later-created Snapshot indirectly references the base Snapshot of the writeable, read-only Snapshot. Thus, as long as the later-created Snapshot is not deleted, the base Snapshot cannot be removed, even when the vdisk has already been deleted. In addition, as more writeable, read-only Snapshots and more regular Snapshots are created, interdependencies arise among these Snapshots, making the management of the Snapshots a complicated task.
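The dependency problem can be made concrete with a small sketch. The sketch assumes that each file is modeled as a dictionary and that a vdisk is any file carrying a "base" attribute; the function name blockers and this data layout are hypothetical and serve only to illustrate the reasoning. The function lists every holder, whether the active file system or another Snapshot, that still contains a vdisk naming the target Snapshot as its base.

```python
def blockers(target, snapshots, active_files):
    """Return the holders that still reference `target` as a vdisk base.

    snapshots:    dict of Snapshot name -> {file name -> file attributes}
    active_files: {file name -> file attributes} for the active file system
    A file is treated as a vdisk if its attributes contain a "base" entry.
    """
    found = []
    if any(f.get("base") == target for f in active_files.values()):
        found.append("active")
    for name, files in snapshots.items():
        if name == target:
            continue
        # A later Snapshot that captured the vdisk keeps the reference alive.
        if any(f.get("base") == target for f in files.values()):
            found.append(name)
    return found
```

For example, if a vdisk based on Snapshot S1 exists in the active file system when Snapshot S2 is taken, then deleting the vdisk from the active file system does not release S1. In this model, S2 is still reported as a blocker until S2 itself is deleted.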
Further, the writeable, read-only Snapshot technique is not applicable when the whole file system, including the active file system, is read-only. For example, in a storage mirroring system, the mirror of the source file system is not modifiable unless the modification is for data synchronization between the source file system and the mirror. Therefore, a vdisk cannot be written into the mirror for the purpose of making a Snapshot of the mirror writeable. As a result, the writeable, read-only Snapshot technique does not work in this scenario.