The present invention relates generally to the field of file system management, and more particularly to file change replication in clustered file systems.
The Wikipedia entry for “Clustered_file_system” as of Apr. 20, 2015 states as follows: “A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. . . . Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance. . . . A shared-disk filesystem uses a storage-area network (SAN) to provide direct disk access from multiple computers at the block level. Access control and translation from file-level operations that applications use to block-level operations used by the SAN must take place on the client node. The most common type of clustered filesystem[ ] is [a] shared-disk filesystem, which—by adding mechanisms for concurrency control—provides a consistent and serializable view of the file system, avoiding corruption and unintended data loss even when multiple clients try to access the same files at the same time. It is a common practice for shared-disk filesystems to employ some sort of a fencing mechanism to prevent data corruption in case of node failures, because an unfenced device can cause data corruption if it loses communication with its sister nodes, and tries to access the same information other nodes are accessing.”
The Wikipedia entry for “Journaling_file_system” as of Apr. 20, 2015 states as follows: “A journaling file system . . . keeps track of the changes that will be made in a journal . . . before committing them to the main file system. . . . Updating file systems to reflect changes to files and directories . . . makes it possible for an interruption (like a power failure or system crash) between writes to leave data structures in an invalid intermediate state. . . . Detecting and recovering from such inconsistencies normally requires a complete walk of [the file system's] data structures. . . . If the file system is large and if there is relatively little I/O bandwidth, this can take a long time and result in longer downtimes if it blocks the rest of the system from coming back online. To prevent this, a journaled file system allocates a special area—the journal—in which it records the changes it will make ahead of time. After a crash, recovery simply involves reading the journal from the file system and replaying changes from this journal until the file system is consistent again.”
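By way of illustration only, the journaling behavior described in the passage quoted above can be sketched in a few lines of Python. The sketch is a simplified in-memory model and does not reflect any particular file system: the class name JournaledStore, its methods, the record layout, and the example keys are all hypothetical. A real journal resides on persistent storage and must also control the ordering and flushing of writes, which this sketch omits.

```python
class JournaledStore:
    """Toy in-memory model of a store protected by a write-ahead journal.

    Hypothetical illustration: `data` stands in for the main file system
    structures and `journal` for the reserved journal area.
    """

    def __init__(self):
        self.journal = []    # records: ("set", txid, key, value) or ("commit", txid)
        self.data = {}       # the "main file system" state
        self.next_txid = 1

    def log_transaction(self, changes, commit=True):
        """Record the intended changes in the journal before touching `data`."""
        txid = self.next_txid
        self.next_txid += 1
        for key, value in changes.items():
            self.journal.append(("set", txid, key, value))
        if commit:
            # The commit record marks the transaction as complete in the journal.
            self.journal.append(("commit", txid))
        return txid

    def apply(self, txid, crash_after=None):
        """Write a transaction's changes to `data`.

        If `crash_after` is given, raise after that many writes to simulate
        an interruption that leaves `data` in an intermediate state.
        """
        applied = 0
        for record in self.journal:
            if record[0] == "set" and record[1] == txid:
                if crash_after is not None and applied == crash_after:
                    raise RuntimeError("simulated crash")
                self.data[record[2]] = record[3]
                applied += 1

    def recover(self):
        """Replay committed transactions from the journal, then empty it.

        Transactions without a commit record are discarded, so `data` never
        reflects a partially logged change.
        """
        committed = {r[1] for r in self.journal if r[0] == "commit"}
        for record in self.journal:
            if record[0] == "set" and record[1] in committed:
                self.data[record[2]] = record[3]
        self.journal.clear()


def demo():
    """Log two changes, crash after the first write, then recover."""
    store = JournaledStore()
    txid = store.log_transaction({"inode:7:size": 4096, "dir:/a:entry": "file.txt"})
    try:
        store.apply(txid, crash_after=1)
    except RuntimeError:
        pass  # the interruption left only one of the two changes applied
    partial = dict(store.data)
    store.recover()
    return partial, dict(store.data)
```

In this sketch, recovery reads only the journal rather than walking every data structure, which corresponds to the shorter recovery time the quoted passage attributes to journaled file systems. Replaying a committed change that had already been applied is harmless because each record simply sets a key to a value.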