Existing (synchronous and asynchronous) block replication schemes across storage systems are generally useful for disaster recovery. Specifically, if a source site fails, then a target site can take over the function of the source site. Because the dynamic/volatile server state at the source, including cached data, is not necessarily present at a target site when a disaster causes the source server to fail, a server on the target site must generally perform system-level or application-level recovery of the replicated data before beginning service at the target site. For example, in a logging file system, before a replica of a storage volume at a remote site can be used as a live file system, the file system log must be replayed. Specifically, the data structures on a storage volume are not always mutually consistent while they are being updated; generally, any update might affect more than one stored data structure, and the stored structures cannot all be updated simultaneously. Different architectures for storage applications have different ways of managing this temporary inconsistency: some use an idempotent operations log to record intended updates and replay those operations after a restart, while others instead sweep over all data structures after a restart, looking for inconsistencies to repair. In both cases, the objective is to restore the consistency of the data structures after a system restart.
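The log-replay approach can be illustrated with a minimal sketch. The record format, the dictionary standing in for a storage volume, and the names LogRecord and replay are hypothetical and chosen only for illustration; they do not describe any particular file system. The sketch assumes each logged operation is an absolute "set this structure to this value" update, which is what makes replay idempotent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One intended update: set `key` to `value` (hypothetical record format)."""
    key: str
    value: str


def replay(volume: dict, log: list) -> dict:
    """Apply every logged update to the volume, in log order."""
    for rec in log:
        # Absolute assignment: applying the same record again changes nothing.
        volume[rec.key] = rec.value
    return volume


# Assumed crash scenario: the directory entry reached the volume,
# but the inode it refers to did not.
volume = {"dir:/a": "inode 7"}
log = [LogRecord("dir:/a", "inode 7"), LogRecord("inode:7", "size=4096")]

once = replay(dict(volume), log)   # recovery after the crash
twice = replay(dict(once), log)    # replaying again is harmless
```

After replay, both structures touched by the update are present and mutually consistent, and a second replay leaves the volume unchanged.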
Another problem that would ideally be solved by a data replication scheme is that of making data available at multiple sites simultaneously, by having multiple replicas that provide access to read-only versions of the source data. Block replication alone is not sufficient to solve this problem because of the recovery steps necessary to allow a server to provide access to the underlying data. In particular, server software generally cannot track changes that block replication makes to a storage volume it uses, and therefore cannot update its own cache of that data. Having data replicated at multiple sites is valuable because it allows for greater data availability in the face of network failures, e.g., the inability to connect to some but not all of the sites with replicated data.
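The cache problem can be shown with a small sketch. The CachingServer class and the dictionary standing in for a replicated volume are hypothetical; the sketch only assumes that the server caches blocks on first read and that block replication writes to the volume without notifying the server.

```python
class CachingServer:
    """Toy server that caches each block the first time it is read."""

    def __init__(self, volume: dict):
        self.volume = volume
        self.cache = {}

    def read(self, block: int) -> str:
        if block not in self.cache:
            self.cache[block] = self.volume[block]
        return self.cache[block]


replica_volume = {0: "v1"}
server = CachingServer(replica_volume)

first = server.read(0)      # block is fetched and cached
replica_volume[0] = "v2"    # block replication updates the volume beneath the server
stale = server.read(0)      # server still answers from its cache
```

The second read returns the old contents even though the volume has changed, which is why a replica kept current by block replication alone cannot simply be served as a live read-only copy.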
In contrast, a file system replication scheme such as “rsync” allows for concurrent data updates and highly available read-only access, because the data updates flow through the file system itself, updating the server cache as well as the data on disk. With “rsync,” a process running as a file system application at a source site communicates over a network with a process running as a file system application at a destination site. The process at the source site reads the current state of a set of files and transmits that current state to the process at the destination site, and the process at the destination site adjusts the state of the corresponding files at the destination site to mirror those at the source site. However, such schemes are limited in performance compared to block replication schemes because the latter typically leverage specialized hardware and firmware.
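The file-level mirroring described above can be sketched as follows. This is not the rsync implementation or its delta-transfer algorithm; it is a simplified stand-in that compares whole-file hashes, runs in a single process rather than over a network, and uses a hypothetical function name, mirror. It shows only that the replication operates on files through the ordinary file system interface.

```python
import hashlib
import shutil
from pathlib import Path


def digest(path: Path) -> str:
    """Hash a file's contents so source and destination can be compared."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def mirror(src: Path, dst: Path) -> list:
    """Make the files under dst match those under src.

    Returns the relative paths that were copied or deleted.
    """
    dst.mkdir(parents=True, exist_ok=True)
    changed = []
    src_files = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}

    # Copy files that are missing or different at the destination.
    for rel in sorted(src_files):
        target = dst / rel
        if not target.exists() or digest(src / rel) != digest(target):
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(src / rel, target)
            changed.append(rel.as_posix())

    # Delete destination files that no longer exist at the source.
    for p in list(dst.rglob("*")):
        if p.is_file() and p.relative_to(dst) not in src_files:
            p.unlink()
            changed.append(p.relative_to(dst).as_posix())
    return changed
```

Because every change at the destination is made through normal file operations, a server at the destination sees the updates through its own file system and cache, at the cost of reading and comparing file contents in software.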
Either block replication or file replication is usable for disaster recovery, because each makes a copy of the data available to the “recovery” step that is required after a crash. However, block replication does not by itself increase availability or provide load balancing. File system replication can provide both, but with lower performance than block replication.
It is therefore a challenge for the computer industry to develop techniques for exploiting the advantages of block-level replication schemes to implement file system replication schemes.