The present disclosure relates to replicating data in data storage systems, and more particularly to preventing inconsistent backup copies of data volumes following node failure in data storage systems having multiple nodes at each storage site.
In data storage systems, it is often useful to have stored data replicated in multiple locations so that the data is backed up and available locally in each of the locations. Each location will have a local data storage device, which can satisfy requests to read data independently. However, requests to write data need to be distributed to each location, so that they can be applied in a consistent fashion. In particular, if multiple write requests are made to a particular region of storage, such as a block, sector or page of data in the storage, the writes must be applied in the same order by each local data storage device; otherwise the data stored in each local data storage device will be inconsistent. A “write collision” occurs when write requests are received that might not be applied in the same order on different local storage devices.
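The effect of a write collision can be illustrated with a minimal sketch. The names used here (apply_writes, write_a, write_b, site_1, site_2) and the block number are hypothetical and chosen only for illustration; a volume is modelled as a simple mapping from block number to data.

```python
def apply_writes(writes):
    """Apply (block, data) writes in the given order and return the resulting volume."""
    volume = {}
    for block, data in writes:
        volume[block] = data  # a later write to the same block overwrites the earlier one
    return volume

# Two write requests directed at the same region of storage (block 7).
write_a = (7, "A")
write_b = (7, "B")

# Each location receives both writes, but applies them in a different order.
site_1 = apply_writes([write_a, write_b])
site_2 = apply_writes([write_b, write_a])

# The copies now disagree about the contents of block 7.
collision = site_1 != site_2
```

In this sketch each location holds both writes, yet the two copies of block 7 differ solely because the order of application differed.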
A known solution to write collisions is to use one location to process write requests made to any of the locations, and distribute the results of that processing to the other locations, so that the data in each location is consistent. However, this means that for any location other than the location that processes the write requests, the time taken to complete a write request will be at least twice the round-trip time between the locations.
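The single-location approach described above can be sketched as follows. This is an illustrative model only, under the assumption that forwarding and distribution are simple in-process calls; the class and method names (Site, ProcessingSite, handle_write) are hypothetical, and the sketch does not model message latency.

```python
class Site:
    """A location with a local copy of the volume (block number -> data)."""

    def __init__(self, name):
        self.name = name
        self.volume = {}

    def apply(self, block, data):
        self.volume[block] = data


class ProcessingSite(Site):
    """The one location that processes every write request (hypothetical model)."""

    def __init__(self, name, peers):
        super().__init__(name)
        self.peers = peers

    def handle_write(self, block, data):
        # The order in which this location handles writes becomes the order
        # applied everywhere, so every copy sees the same sequence.
        self.apply(block, data)
        for peer in self.peers:
            peer.apply(block, data)


remote = Site("remote")
primary = ProcessingSite("primary", [remote])

# Writes made at either location are all routed through the processing location.
primary.handle_write(7, "A")  # write received locally at the processing location
primary.handle_write(7, "B")  # write received at the remote location and forwarded

consistent = primary.volume == remote.volume
```

Because every write passes through one location, the copies cannot diverge; the cost is the additional communication between locations for writes that originate elsewhere.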