Techniques are known for synchronizing a remote (“server”) filesystem with a local (“client”) filesystem across a network. The remote filesystem may be a near replica of the local filesystem; for instance, it may represent a recent backup of the local filesystem. To synchronize the remote filesystem with the local filesystem, for example to reflect any changes made to the local filesystem since the last synchronization, it is necessary to update the remote filesystem's structure, namespace, and metadata.
A typical “full” synchronization approach uses the most network bandwidth but no additional local storage to synchronize the filesystem's structure, namespace, and metadata. The modification time (“mtime”) and size of every file on the local filesystem are compared with the mtime and size of the corresponding file on the server. If the file does not exist on the server, or its mtime or size differs there, the client creates the file on the server and synchronizes the file content. The client also updates any other metadata (user ID (“UID”), group ID (“GID”), file permissions, etc.) associated with the file. Any files not specified by the client are deleted by the server. The popular utility “rsync” uses this approach.
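By way of illustration only, the comparison performed during a full synchronization can be sketched as follows. This is a minimal sketch, not the implementation of any particular utility: the names `FileMeta` and `plan_full_sync` are hypothetical, and both filesystems are modeled as in-memory dictionaries mapping a path to its metadata rather than as real network endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical per-file metadata record used for this sketch."""
    mtime: int
    size: int
    uid: int = 0
    gid: int = 0
    mode: int = 0o644


def plan_full_sync(local, remote):
    """Compare every local file with the server's copy.

    `local` and `remote` map path -> FileMeta. Returns three sorted lists:
    paths whose content must be sent, paths that only need a metadata
    update, and paths the server should delete.
    """
    upload, update_meta = [], []
    for path, mine in sorted(local.items()):
        theirs = remote.get(path)
        if theirs is None or (theirs.mtime, theirs.size) != (mine.mtime, mine.size):
            # Missing on the server, or mtime/size differs: send the content
            # (the remaining metadata is assumed to travel with it).
            upload.append(path)
        elif theirs != mine:
            # Content assumed unchanged; only UID, GID, or mode differs.
            update_meta.append(path)
    # Anything the client did not mention is removed from the server.
    delete = sorted(set(remote) - set(local))
    return upload, update_meta, delete
```

Because every local file is compared against the server on every run, the metadata for the whole tree crosses the network each time, which is the bandwidth cost noted above.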
A typical “incremental” synchronization approach uses less network bandwidth but some local storage. After a full synchronization, the client stores the current filesystem structure, namespace, and metadata in a “catalog” (typically a database). During an incremental synchronization, the client first queries the catalog for every file. If the file is not represented in the catalog, or its mtime or size differs from the cataloged value, the client creates the file on the server and synchronizes the file content. If the file is represented in the catalog and its mtime and size are the same, then its content is assumed to be unchanged, and the client updates only the other metadata associated with the file, and only where it differs from what the catalog records. The client deletes any files on the server that are represented in the catalog but no longer exist on the local filesystem.
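The catalog-based decision can be sketched in the same spirit. The sketch below assumes an SQLite database as the catalog and a hypothetical table layout; `open_catalog` and `plan_incremental_sync` are illustrative names, and the local filesystem is again modeled as a dictionary mapping each path to a `(mtime, size, uid, gid, mode)` tuple.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) the hypothetical local catalog database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS catalog ("
        "path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER, "
        "uid INTEGER, gid INTEGER, mode INTEGER)"
    )
    return conn


def plan_incremental_sync(conn, local):
    """Decide what to send using only the local catalog, not the server.

    `local` maps path -> (mtime, size, uid, gid, mode). Returns sorted
    lists of paths to upload, to update metadata for, and to delete.
    """
    upload, update_meta = [], []
    for path, meta in sorted(local.items()):
        row = conn.execute(
            "SELECT mtime, size, uid, gid, mode FROM catalog WHERE path = ?",
            (path,),
        ).fetchone()
        if row is None or row[:2] != meta[:2]:
            upload.append(path)        # new file, or mtime/size changed
        elif row != meta:
            update_meta.append(path)   # content assumed unchanged
    known = [r[0] for r in conn.execute("SELECT path FROM catalog")]
    delete = sorted(p for p in known if p not in local)
    # Record the new state. A real client would do this only after the
    # server confirms the changes; that handshake is omitted here.
    conn.execute("DELETE FROM catalog")
    conn.executemany(
        "INSERT INTO catalog VALUES (?, ?, ?, ?, ?, ?)",
        [(p, *m) for p, m in local.items()],
    )
    conn.commit()
    return upload, update_meta, delete
```

Only the paths in the returned lists need to be mentioned to the server, which is where the bandwidth saving comes from; the price is one catalog row of local storage per file.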
Another incremental approach uses about the same amount of network bandwidth as the catalog-based approach but (usually) less local storage. Every operation on every file since the last backup is recorded in a filesystem “journal”. To synchronize the remote filesystem, the journal is essentially played back like a recording. This eliminates the need to store the metadata for every file (since most files never change), but it is more complicated and more prone to synchronization errors.
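Journal playback can be sketched as follows. The journal format here is an assumption made purely for illustration: each entry is an `(operation, path, argument)` tuple, the operation names are hypothetical, and the remote filesystem is modeled as a dictionary mapping each path to a dictionary of metadata.

```python
def replay_journal(remote, journal):
    """Apply recorded operations to `remote` in the order they occurred.

    `remote` maps path -> metadata dict and is modified in place.
    Each journal entry is a hypothetical (op, path, arg) tuple.
    """
    for op, path, arg in journal:
        if op == "write":
            remote[path] = dict(arg)          # create or overwrite the file
        elif op == "setattr":
            remote[path].update(arg)          # change UID, GID, mode, etc.
        elif op == "rename":
            remote[arg] = remote.pop(path)    # arg is the new path
        elif op == "delete":
            del remote[path]
        else:
            raise ValueError(f"unknown journal operation: {op!r}")
    return remote
```

The sketch also shows why this approach is fragile: playback is only correct if the remote state matches the state the journal started from and no entry is lost. A single missing entry, for example a dropped "write" before a later "setattr" on the same path, raises a `KeyError` here and would leave a real server out of step with the client.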