The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for replicating data in a data storage system.
In data storage systems, it is often desirable to have stored data replicated in multiple locations, so that the data is available locally in each of the locations. Each location will have a local data storage device, which can satisfy requests to read data on its own, i.e. without needing to query other data storage devices of the data storage system. However, requests to write data need to be distributed to each location, so that they can be applied in a consistent fashion. In particular, if multiple write requests are made to a particular region of storage, such as a block, sector or page of data in the storage, the writes must be applied in the same order by each local data storage device; otherwise the data stored in each local data storage device will not be the same. A situation in which write requests are received that might not be applied in the same order on different local data storage devices is known as a “write collision”.
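The effect of a write collision can be illustrated with a minimal sketch. The function `apply_writes`, the four-block store and the sample writes are hypothetical names and values chosen for illustration only; they are not part of any described system.

```python
def apply_writes(writes, size=4):
    """Apply (block_index, value) writes, in the given order, to an empty store."""
    store = [None] * size
    for block, value in writes:
        store[block] = value
    return store


# Two writes target the same block (block 0); a third targets block 1.
w1 = (0, "A")
w2 = (0, "B")
w3 = (1, "C")

# Two local devices receive the same writes but apply the colliding
# writes w1 and w2 in opposite orders.
site_x = apply_writes([w1, w2, w3])
site_y = apply_writes([w2, w1, w3])

# The stores differ in block 0, so the replicas are no longer identical.
diverged = site_x != site_y

# Writes to different blocks can be reordered without affecting the result.
independent_ok = apply_writes([w1, w3]) == apply_writes([w3, w1])
```

In this sketch only the writes to the same block need a common order; the writes to distinct blocks give the same result in either order.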
With such systems, it can also be desirable to have data replicated in a location by a data storage device that does not itself receive any requests to write data (other than those required to keep the data synchronised with the other locations). Such a data storage device may be used during migration from one location to another, for example, or to provide a backup in case one of the data storage devices that receives write requests fails.
A naïve solution to this problem would be to forward all write requests to a single data storage device, and coordinate all updates through that device. However, a drawback of this approach is that the write latency will be significantly higher for write requests received by data storage devices other than the one performing coordination, most likely twice as high.
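The single-coordinator approach and its latency cost can be sketched as follows. This is an illustrative model only: the `Coordinator` class, the sequence-number scheme and the latency formula are assumptions made for the example. In particular, it assumes that replication from the coordinator costs one network round trip, and that a write received elsewhere costs one further round trip to forward the request and return its completion.

```python
from dataclasses import dataclass


@dataclass
class Coordinator:
    """Hypothetical single device that assigns a global order to every write."""
    next_seq: int = 0

    def order(self, block, value):
        # Tag the write with the next sequence number.
        seq = self.next_seq
        self.next_seq += 1
        return (seq, block, value)


def apply_in_sequence(ordered_writes, size=4):
    """Apply writes by sequence number, regardless of arrival order."""
    store = [None] * size
    for _seq, block, value in sorted(ordered_writes):
        store[block] = value
    return store


def write_latency_ms(received_at_coordinator, round_trip_ms):
    """Assumed latency model: one round trip to replicate, plus one more
    round trip if the write first has to be forwarded to the coordinator."""
    replicate = round_trip_ms
    forward = 0 if received_at_coordinator else round_trip_ms
    return forward + replicate


coordinator = Coordinator()
first = coordinator.order(0, "A")
second = coordinator.order(0, "B")

# Both devices converge even though one receives the writes out of order.
site_x = apply_in_sequence([first, second])
site_y = apply_in_sequence([second, first])

local_latency = write_latency_ms(True, round_trip_ms=10)
remote_latency = write_latency_ms(False, round_trip_ms=10)
```

Under these assumptions the replicas stay consistent, but a write received by a non-coordinating device takes two round trips rather than one, which corresponds to the doubling of latency noted above.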
U.S. Pat. No. 8,868,857 B2, published 21 Oct. 2014, discloses a method of managing remote data replication in which an index generator generates an ordered index of writes made to replicated data storage.