1. Field of the Invention
The present invention relates to a distributed cluster computer environment and, more particularly, to multiple destination mirroring in such an environment.
2. Background Information
A storage system typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
The storage operating system of the storage system may implement a high-level module, such as a file system, to logically organize the information stored on volumes as a hierarchical structure of data containers, such as files and logical units. For example, each “on-disk” file may be implemented as a set of data structures, i.e., disk blocks, configured to store information, such as the actual data for the file. These data blocks are organized within a volume block number (vbn) space that is maintained by the file system. The file system may also assign each data block in the file a corresponding “file offset” or file block number (fbn). The file system typically assigns sequences of fbns on a per-file basis, whereas vbns are assigned over a larger volume address space. The file system organizes the data blocks within the vbn space as a “logical volume”; each logical volume may be, although is not necessarily, associated with its own file system.
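By way of illustration only, the relationship between per-file fbns and volume-wide vbns may be sketched as follows. The class names `Volume` and `File`, and the simple sequential allocator, are assumptions made for this sketch and are not taken from any particular file system.

```python
class Volume:
    """Toy volume: a single vbn space shared by all files (illustrative only)."""

    def __init__(self):
        self.blocks = {}     # vbn -> block data
        self.next_vbn = 0    # next unallocated vbn in the volume-wide space

    def allocate(self, data):
        """Store data in the next free block and return its vbn."""
        vbn = self.next_vbn
        self.blocks[vbn] = data
        self.next_vbn += 1
        return vbn


class File:
    """Toy file: maps per-file fbns (0, 1, 2, ...) to volume-wide vbns."""

    def __init__(self, volume):
        self.volume = volume
        self.fbn_to_vbn = []  # list index is the fbn, value is the vbn

    def append(self, data):
        """Add a block to the end of the file and return its fbn."""
        vbn = self.volume.allocate(data)
        self.fbn_to_vbn.append(vbn)
        return len(self.fbn_to_vbn) - 1

    def read(self, fbn):
        """Translate the fbn to a vbn, then fetch the block from the volume."""
        return self.volume.blocks[self.fbn_to_vbn[fbn]]


# Two files interleave their writes on the same volume.
vol = Volume()
a, b = File(vol), File(vol)
a.append(b"a0")
b.append(b"b0")
a.append(b"a1")
```

In this sketch each file numbers its own blocks from zero, while the vbns of a single file need not be contiguous because they are drawn from the address space shared by the whole volume.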
A known type of file system is a write-anywhere file system that does not overwrite data on disks. If a data block is retrieved (read) from disk into a memory of the storage system and “dirtied” (i.e., updated or modified) with new data, the data block is thereafter stored (written) to a new location on disk to optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. An example of a write-anywhere file system that is configured to operate on a storage system is the Write Anywhere File Layout (WAFL™) file system available from Network Appliance, Inc., Sunnyvale, Calif.
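The write-anywhere behavior described above may be sketched, under simplifying assumptions, as follows. The class `WriteAnywhereStore` and its fields are hypothetical names chosen for illustration; the sketch is not the WAFL implementation and omits free-space reclamation and layout optimization.

```python
class WriteAnywhereStore:
    """Toy store in which an updated block is never overwritten in place."""

    def __init__(self):
        self.disk = {}        # disk location -> data
        self.block_map = {}   # logical block id -> current disk location
        self.next_free = 0    # next unused disk location

    def write(self, block_id, data):
        """Write data to a new location and repoint the block at it.

        Returns (previous location or None, new location).
        """
        location = self.next_free
        self.next_free += 1
        self.disk[location] = data
        previous = self.block_map.get(block_id)
        self.block_map[block_id] = location
        return previous, location

    def read(self, block_id):
        """Return the current contents of the logical block."""
        return self.disk[self.block_map[block_id]]


store = WriteAnywhereStore()
store.write("blk", b"v1")
old_loc, new_loc = store.write("blk", b"v2")  # "dirtied" block goes elsewhere
```

Because the earlier location is left intact after the second write, the previous contents remain available on disk until that location is reclaimed, which is the property on which point-in-time images of the file system can rely.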
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access data containers stored on the system. In this model, the client may comprise an application, such as a database application, executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the storage system by issuing file-based and block-based protocol messages (in the form of packets) to the system over the network.
A plurality of storage systems may be interconnected to provide a storage system environment configured to service many clients. Each storage system may be configured to service one or more volumes, wherein each volume stores one or more data containers. Yet often a large number of data access requests issued by the clients may be directed to a small number of data containers serviced by a particular storage system of the environment. A solution to such a problem is to distribute the volumes serviced by the particular storage system among all of the storage systems of the environment. This, in turn, distributes the data access requests, along with the processing resources needed to service such requests, among all of the storage systems, thereby reducing the individual processing load on each storage system.
In order to improve reliability and to facilitate disaster recovery in the event of a failure in a distributed system, it is common to “mirror,” i.e., replicate, some or all of the underlying data and/or the file system that organizes that data from a source volume associated with a primary storage system or server to one or more remote storage destinations. To that end, a mirror of the source volume is established and stored as a destination volume at a remote site, making it more likely that recovery is possible in a disaster that may physically damage the main storage location or infrastructure (e.g., floods, power outage, act of war, etc.). The mirror is updated at regular intervals, typically by an administrator, in an effort to reproduce the most recent changes to the volume.
The inherent Snapshot™ capabilities of the exemplary WAFL file system are further described in TR3002 File System Design for an NFS File Server Appliance by David Hitz et al., published by Network Appliance, Inc., which is hereby incorporated by reference as though fully set forth herein. Further details are provided in commonly owned U.S. Pat. No. 6,993,539, entitled SYSTEM AND METHOD FOR DETERMINING CHANGES IN TWO SNAPSHOTS AND FOR TRANSMITTING CHANGES TO A DESTINATION SNAPSHOT, filed on Mar. 19, 2002, which is hereby incorporated by reference as though fully set forth herein.
It is noted that “Snapshot” is a trademark of Network Appliance, Inc. It is used for purposes of this patent to designate a persistent consistency point (CP) image. A persistent consistency point image (PCPI) is a point-in-time representation of the storage system, and more particularly, of the active file system, stored on a storage device (e.g., on disk) or in other persistent memory and having a name or other unique identifier that distinguishes it from other PCPIs taken at other points in time. A PCPI can also include other information (metadata) about the active file system at the particular point in time for which the image is taken. The terms “PCPI” and “snapshot” shall be used interchangeably throughout this patent without derogation of Network Appliance's trademark rights.
Snapshots are generally created on some regular schedule. This schedule is subject to great variation. In addition, the number of snapshots retained by the filer is highly variable. Under one storage scheme, a number of recent snapshots are stored in succession (for example, a few days' worth of snapshots each taken at four-hour intervals), and a number of older snapshots are retained at increasing time spacings (for example, a number of daily snapshots for the previous week(s) and weekly snapshots for the previous few months). The snapshot is stored on-disk along with the active file system, and is called into the buffer cache of the filer memory as requested by the storage operating system or other application. However, it is contemplated that a variety of snapshot creation techniques and timing schemes can be implemented within the teachings of this invention.
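One possible retention scheme of the kind described above may be sketched as follows. The function name `select_retained` and the default windows (two days of every snapshot, two weeks of daily snapshots, roughly three months of weekly snapshots) are assumptions chosen purely for illustration, since the schedule is subject to great variation.

```python
from datetime import datetime, timedelta


def select_retained(snapshot_times, now, recent_days=2, daily_weeks=2,
                    weekly_months=3):
    """Return the snapshot times kept under a tiered retention scheme.

    All snapshots newer than `recent_days` are kept; older ones are thinned
    to the newest per calendar day, then the newest per ISO week; anything
    older than about `weekly_months` months is dropped.
    """
    recent_cutoff = now - timedelta(days=recent_days)
    daily_cutoff = now - timedelta(weeks=daily_weeks)
    weekly_cutoff = now - timedelta(days=30 * weekly_months)

    kept = []
    seen_days, seen_weeks = set(), set()
    for t in sorted(snapshot_times, reverse=True):  # newest first
        if t >= recent_cutoff:
            kept.append(t)                          # keep every recent one
        elif t >= daily_cutoff:
            if t.date() not in seen_days:           # newest of each day
                seen_days.add(t.date())
                kept.append(t)
        elif t >= weekly_cutoff:
            week = tuple(t.isocalendar())[:2]       # (ISO year, ISO week)
            if week not in seen_weeks:              # newest of each week
                seen_weeks.add(week)
                kept.append(t)
    return kept


# Snapshots taken every four hours over roughly ten days.
now = datetime(2024, 1, 31, 0, 0)
times = [now - timedelta(hours=4 * i) for i in range(60)]
kept = select_retained(times, now)
```

Under these assumed windows, every four-hourly snapshot from the most recent two days is retained, while the earlier days in the example are each reduced to a single snapshot.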
One form of snapshot process includes the active file system (e.g., inodes and data blocks) at the primary server being captured and transmitted as a whole over a network (such as the Internet) to a remote storage destination site. Generally, a snapshot is an image, typically read-only, which is a replication of a volume at a point in time. The replicated image is initially stored on one or more storage devices on the primary server. After the snapshot is created and stored, the active file system is reestablished leaving the snapshot version in place for possible future restoration of the file system at previous points in time. The snapshot process is described in further detail in United States Publication No. US 2002/0083037, entitled INSTANT SNAPSHOT, by Blake Lewis et al., now issued as U.S. Pat. No. 7,454,445 on Nov. 18, 2008, which is hereby incorporated by reference as though fully set forth herein, and in U.S. Pat. No. 7,010,553 entitled SYSTEM AND METHOD FOR REDIRECTING ACCESS TO A REMOTE MIRRORED SNAPSHOT, by Raymond C. Chen et al., which is hereby incorporated by reference as though fully set forth herein.
As noted, it is often necessary to update the mirrored system when the active file system on the primary server experiences changes. Typically, a new snapshot of the entire file system is periodically generated and transmitted to each destination. However, it is desirable to transmit incrementally the changes to the file system, instead of the entire file system. In order to update incrementally each snapshot with current changes, the source volume is typically scanned at least one time for each destination that is mirrored in order to find the updates which have not yet been transmitted to that particular destination. This involves multiple scans of the source volume, which consume time and bandwidth of the volume and the server, and further requires keeping track of which version of the file system exists on each destination. Some systems, however, do not provide version support with respect to each data block of the file system and, therefore, it is difficult to determine which snapshot exists on each destination volume in the system.
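The cost of the conventional approach described above may be illustrated with the following sketch, in which each destination triggers its own full comparison of the source. The functions `changed_blocks` and `update_destinations`, the dictionary layout of a destination, and the representation of a snapshot as a mapping from block number to contents are all hypothetical simplifications; block deletions are ignored.

```python
def changed_blocks(base_snapshot, current_snapshot):
    """One full scan: compare every block of the current image to the base."""
    return {
        block_id: data
        for block_id, data in current_snapshot.items()
        if base_snapshot.get(block_id) != data
    }


def update_destinations(snapshots, current_name, destinations):
    """Bring each destination up to the current snapshot; return scan count.

    The source must remember which snapshot each destination holds, and it
    performs a separate scan for every destination.
    """
    current = snapshots[current_name]
    scans = 0
    for dest in destinations:
        base = snapshots[dest["snapshot"]]      # image the destination holds
        delta = changed_blocks(base, current)   # per-destination scan
        scans += 1
        dest["blocks"].update(delta)            # "transmit" only the changes
        dest["snapshot"] = current_name
    return scans


snapshots = {
    "s1": {0: "a", 1: "b"},
    "s2": {0: "a", 1: "B"},
    "s3": {0: "A", 1: "B", 2: "c"},
}
destinations = [
    {"name": "d1", "snapshot": "s1", "blocks": dict(snapshots["s1"])},
    {"name": "d2", "snapshot": "s2", "blocks": dict(snapshots["s2"])},
]
scans = update_destinations(snapshots, "s3", destinations)
```

In this sketch the number of scans grows in direct proportion to the number of destinations, and the update is correct only if the source's record of which snapshot each destination holds is accurate.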
Thus, there remains a need for a method and system for mirroring a source volume to multiple destinations which reduces the number of scans performed on the volume while maintaining accurate information about which snapshot exists on each destination prior to attempted replication.