As used herein, the term “file” should be interpreted broadly to include any type of data organization whether file-based or block-based. Further, as used herein, the term “file system” should be interpreted broadly as a programmatic entity that imposes structure on an address space of one or more physical or virtual disks so that an operating system may conveniently deal with data containers, including files and blocks. An “active file system” is a file system to which data can be both written and read, or, more generally, an active store that responds to both read and write I/O operations.
A file server is a type of storage server that operates on behalf of one or more clients to store and manage shared files in a set of mass storage devices, such as magnetic or optical disks. The mass storage devices are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). One configuration in which file servers can be used is a network attached storage (NAS) configuration. In a NAS configuration, a file server can be implemented in the form of a server, called a filer, that attaches to a network, such as a local area network (LAN) or a corporate intranet. An example of such a server is any of the NetApp Filer products made by Network Appliance, Inc. of Sunnyvale, Calif.
A file server can be used to back up data, among other things. One particular type of data backup technique is known as “mirroring”. Mirroring involves backing up data stored at a primary site by storing an exact duplicate (an image) of the data at a remote secondary site. The purpose is that, if data is ever lost at the primary site, it can be recovered from the secondary site.
In a simple example of a mirroring configuration, a source file server located at the primary site may be coupled locally to a first set of mass storage devices, to a set of clients through a local area network (LAN), and to a destination file server located at the remote secondary site through a wide area network (WAN) or a metropolitan area network (MAN). The destination file server is coupled locally to a second set of mass storage devices at the secondary site.
The source file server receives various read and write requests from the clients. In a system that handles large volumes of client requests, it may be impractical to save data modifications to the mass storage devices every time a write request is received from a client, because mass storage device accesses tend to take a relatively long time compared to other operations. Therefore, the source file server may instead hold write requests in memory temporarily and save the modified data to the mass storage devices periodically, such as every 10 seconds or at whatever time interval is appropriate. The event of saving the modified data to the mass storage devices is called a “consistency point”. At a consistency point, the source file server saves any data that was modified by the write requests to its local mass storage devices and also triggers a process of updating the data stored at the secondary site to mirror the data stored at the primary site. The process of updating the data at the secondary site is referred to as the “synchronization” or “sync” phase of the consistency point (CP) event, or simply “CP sync”.
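By way of illustration only, the buffering behavior described above may be sketched as follows. The sketch is a simplified model and not an implementation of any particular file server: the class name `SourceServer`, the dictionaries standing in for memory and for the mass storage devices, and the `on_cp_sync` callback are all hypothetical names introduced for this example. The scheduling of consistency points (for example, every 10 seconds) is left to the caller.

```python
from typing import Callable, Dict


class SourceServer:
    """Toy model of a source file server that defers writes until a consistency point."""

    def __init__(self, on_cp_sync: Callable[[], None]) -> None:
        # Stand-in for the local mass storage devices (hypothetical model).
        self.disk: Dict[str, bytes] = {}
        # Modified data held temporarily in memory between consistency points.
        self.pending: Dict[str, bytes] = {}
        # Hypothetical hook that starts the sync phase toward the secondary site.
        self.on_cp_sync = on_cp_sync

    def write(self, name: str, data: bytes) -> None:
        # A write request only updates memory; no storage access occurs here.
        self.pending[name] = data

    def read(self, name: str) -> bytes:
        # Serve the most recent data, whether or not it has been saved yet.
        if name in self.pending:
            return self.pending[name]
        return self.disk[name]

    def consistency_point(self) -> int:
        """Save all modified data, then trigger the sync phase (CP sync)."""
        saved = len(self.pending)
        self.disk.update(self.pending)
        self.pending.clear()
        self.on_cp_sync()
        return saved
```

In this model, two writes to the same file between consistency points result in a single save of the latest data, which is one reason deferring writes reduces the number of storage accesses.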
In the known prior art, the CP sync phase involves comparing a representation of the active state of a file system stored at the secondary site with a corresponding representation of the active state of the file system stored at the primary site, in order to determine what modifications or changes are required to synchronize the data on the primary and secondary sites. This comparison is computationally intensive. Therefore, it is desirable to avoid having to perform such a comparison.
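The cost of such a comparison can be seen in a minimal sketch, given here for illustration only. It assumes, purely for the purpose of the example, that each site's active state can be represented as a mapping from a file name to its contents; the function names `diff_states` and `apply_changes` are hypothetical and do not describe any actual product.

```python
from typing import Dict, List, Tuple


def diff_states(
    primary: Dict[str, bytes], secondary: Dict[str, bytes]
) -> Tuple[Dict[str, bytes], List[str]]:
    """Compare the two representations and return the changes the secondary needs."""
    # Every entry on the primary is examined, even if it has not changed.
    to_write = {
        name: data
        for name, data in primary.items()
        if name not in secondary or secondary[name] != data
    }
    # Every entry on the secondary is examined as well, to find stale entries.
    to_delete = sorted(name for name in secondary if name not in primary)
    return to_write, to_delete


def apply_changes(
    secondary: Dict[str, bytes], to_write: Dict[str, bytes], to_delete: List[str]
) -> None:
    """Bring the secondary representation in line with the primary."""
    secondary.update(to_write)
    for name in to_delete:
        del secondary[name]
```

Even in this simplified form, the work performed grows with the total size of both representations rather than with the amount of data actually modified since the last consistency point.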