A data processing system may store data using multiple devices. Copies of data stored on one device in the data processing system may be stored on one or more other devices of the data processing system such that if one device becomes unavailable, for example due to a power outage or network problem, the data may still be accessed via at least one other device. Accordingly, a data processing system may replicate data entities across multiple devices and keep the replicated data entities synchronized such that backup copies of data entities stored on any one device are available on other devices. Such replication guards against inaccessibility or loss of data should any one device in the data processing system become unavailable.
A data processing system may keep multiple copies of data entities synchronized by ensuring that if a data entity stored on one device is updated, then so are any of its copies stored on other devices. Some data processing systems synchronize data using so-called “lazy propagation” techniques, whereby a data entity is updated first and its copies are updated only afterward. One example of a lazy propagation technique is a so-called “journaling” technique in which changes to a data entity are recorded in a log and information in the log is used to update copies of the data entity when access to a copy of the data entity is needed. For example, multiple changes to a particular data entity stored on server A may be recorded to a log without updating a copy of the data entity stored on server B. At a later time, when server A becomes inaccessible, the copy of the data entity on server B may be updated based on information in the log such that an up-to-date version of the data entity may be accessed via server B.
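The journaling example above can be illustrated with a minimal sketch. It assumes two in-memory stores standing in for server A and server B; the names `Server`, `LazyReplicator`, `update`, and `read` are hypothetical and chosen only for illustration, not taken from any particular system.

```python
class Server:
    """Hypothetical in-memory store standing in for one device."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.available = True


class LazyReplicator:
    """Illustrative journaling scheme: the backup is updated only on failover."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.log = []  # journal of (key, value) changes not yet applied to the backup

    def update(self, key, value):
        # Fast path: change the primary and record the change in the log.
        # The copy on the backup is deliberately left untouched.
        self.primary.data[key] = value
        self.log.append((key, value))

    def read(self, key):
        if self.primary.available:
            return self.primary.data[key]
        # Failover: the backup must first be brought up to date from the log.
        self._replay_log()
        return self.backup.data[key]

    def _replay_log(self):
        # Apply logged changes in the order they were recorded.
        for key, value in self.log:
            self.backup.data[key] = value
        self.log.clear()
```

In this sketch, two calls to `update("x", ...)` leave server B's copy unchanged. Only when server A is marked unavailable does `read("x")` replay the log and return the latest value from server B.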
Lazy propagation techniques allow for fast updating of data entities because updating does not require waiting for all copies of the data entity to be updated. On the other hand, lazy propagation techniques result in slow failover because when a server storing a set of data entities becomes inaccessible, copies of these data entities must first be updated (e.g., based on information in a log) before access to them may be provided via another server or servers.
Some data processing systems synchronize data using so-called “eager replication” techniques. Unlike lazy propagation, where changes to copies of a data entity are made only after the data entity is updated, eager replication involves updating copies of a data entity before updating the data entity itself. For example, prior to making a change to a data entity stored on server A (e.g., a server designated as a “primary” server for the data entity such that all requests to access and/or update the data entity are provided to the primary server), copies of the data entity stored on other servers are updated first and, subsequently, the change is made to the data entity stored on server A.
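The ordering described above, copies first and the primary last, can be sketched as follows. This is a simplified illustration under stated assumptions: the stores are in-memory, updates never fail partway through, and the names `Server`, `EagerReplicator`, `update`, and `read` are hypothetical.

```python
class Server:
    """Hypothetical in-memory store standing in for one device."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.available = True


class EagerReplicator:
    """Illustrative eager scheme: every copy is updated before the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)

    def update(self, key, value):
        # Update every copy first; the caller waits for all of them.
        for replica in self.replicas:
            replica.data[key] = value
        # Only then apply the change to the data entity on the primary.
        self.primary.data[key] = value

    def read(self, key):
        if self.primary.available:
            return self.primary.data[key]
        # Failover: any available replica is already up to date,
        # so no log replay is needed before serving the read.
        for replica in self.replicas:
            if replica.available:
                return replica.data[key]
        raise RuntimeError("no available server holds the data entity")
```

Here `update` does more work per call than a log-based approach, since it touches every copy, while `read` after a primary failure can be served from a replica immediately.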
Updating data entities using conventional eager replication generally takes more time than updating them using lazy propagation because eager replication involves updating copies of a data entity before updating the data entity itself. On the other hand, since all copies of the data entities are kept synchronized, eager replication generally allows for quicker failover than lazy propagation.