Technical Field
The present invention relates to data storage and, more specifically, to providing checkpoints for data storage replicas.
Description of the Related Art
To increase reliability of a distributed data storage system, data is replicated to two or more nodes. On some occasions, nodes in the distributed system may go offline. For instance, a storage node may go offline due to a failure in a server hosting the node. While the node is offline, data in the data storage system may change. As a result, the data stored in the node that went offline may become stale.
After the node that went offline is restored, a resynchronization is performed by using a checkpoint that represents a last known state of the storage node and incrementally rebuilding the node, that is, by applying the changes that occurred in the data storage system since the checkpoint was created. Such checkpoints are created periodically to reduce the amount of data to be resynchronized in the case of a failure of a node.
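As an illustrative sketch only, the checkpoint-based resynchronization described above may be expressed as follows. The names `Replica`, `resynchronize`, and `change_log`, as well as the use of monotonically increasing sequence numbers to order changes, are assumptions made for this example and are not taken from any particular storage system.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical storage node holding key/value data."""
    data: dict = field(default_factory=dict)
    applied_seq: int = 0  # sequence number of the last change applied


def resynchronize(stale, checkpoint_seq, change_log):
    """Replay changes recorded after the checkpoint onto a restored node.

    change_log is assumed to be an iterable of (seq, key, value) tuples
    ordered by ascending sequence number. Returns the number of changes
    replayed.
    """
    replayed = 0
    for seq, key, value in change_log:
        if seq <= checkpoint_seq:
            # Already reflected in the checkpointed state; skip it.
            continue
        stale.data[key] = value
        stale.applied_seq = seq
        replayed += 1
    return replayed
```

In this sketch, a more recent checkpoint means fewer log entries have a sequence number above `checkpoint_seq`, which is why periodic checkpoints reduce the amount of data to be resynchronized.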
In a conventional storage system, the data storage system is brought to a quiescent point prior to creating a checkpoint. As used herein, a quiescent point is a state of a node in which data is not changing. Before a quiescent point is achieved, all operations that are currently in flight (e.g., operations that have arrived at the data storage system but have not yet been applied to the node) are applied to the node. At a quiescent point, every active node in the data storage system contains the same data.
To achieve a quiescent point in a conventional storage system, updates to the data storage system that arrive after initiation of the process to achieve the quiescent point are suspended. During this period of time, all the operations currently in flight for each of the nodes of the data storage system are flushed. Because incoming updates are held back until the flush completes, throughput is reduced during the checkpoint generation process.
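By way of a hedged illustration, the conventional quiesce-then-checkpoint sequence may be modeled with the single-threaded sketch below. The class `ConventionalStore` and its methods are hypothetical names chosen for this example; an actual system would involve concurrency and networking that are omitted here.

```python
import copy
from collections import deque


class ConventionalStore:
    """Hypothetical model of a replicated store that quiesces to checkpoint."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]
        # One queue of in-flight (arrived but unapplied) operations per node.
        self.in_flight = [deque() for _ in range(node_count)]
        self.suspended = deque()  # updates held back while quiescing
        self.quiescing = False

    def submit(self, key, value):
        if self.quiescing:
            # Updates arriving after quiesce initiation are suspended.
            self.suspended.append((key, value))
            return
        for queue in self.in_flight:
            queue.append((key, value))

    def flush(self):
        # Apply every in-flight operation to its node.
        for node, queue in zip(self.nodes, self.in_flight):
            while queue:
                key, value = queue.popleft()
                node[key] = value

    def begin_quiesce(self):
        self.quiescing = True

    def checkpoint(self):
        # After the flush, all nodes hold the same data, so any one
        # node's contents can serve as the checkpointed state.
        self.flush()
        return copy.deepcopy(self.nodes[0])

    def end_quiesce(self):
        self.quiescing = False
        pending, self.suspended = self.suspended, deque()
        for key, value in pending:
            self.submit(key, value)
```

In this model, every update submitted between `begin_quiesce` and `end_quiesce` waits in `suspended`, which corresponds to the reduced throughput noted above.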