Many applications maintain persistent state information by storing data on disk. Often the data stored on disk is designed to allow an application to return to the same state after an unexpected restart. The ability of an application to reliably return to a previous state often depends on data being stored to disk in a specific order. In order to protect against data loss and business interruption due to disasters, application data is often replicated to a geographically remote site. Ideally the remote location is far enough away from the primary data center to ensure that a single disaster will not be able to destroy both data centers. In the event of a disaster, the remote copy of data can be used to either reconstruct a new primary data center or restart the affected applications at the remote location itself. In order for an application to be restarted at the remote site and return to its pre-failure state, data must be copied to the remote site in the appropriate order.
More particularly, to ensure that they can return to the same state, applications strictly control the order in which state information is written to disk. Typically, I/O requests to store new state information to disk are not issued until I/O operations to store previous state information have completed. Such write operations are said to be dependent on the previous write requests. Applications rely on this explicit control of dependent write ordering to ensure that there will be no gaps or misordering of the state information stored on disk. In order to guarantee that this strict write ordering occurs, disk storage systems must store write data to disk in the order that it is received. Furthermore, where remote copies of data are maintained (“remote replication”), the same write ordering restrictions exist. Some advanced storage systems are capable of performing remote replication automatically in a manner transparent to applications. Such solutions relieve critical applications from the burden of managing the remote data copy and allow them to focus on performing their particular business function.
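The dependent-write rule described above can be illustrated with a minimal sketch. The write names (`log-1`, `commit-1`, and so on) and the `is_restartable` helper are hypothetical, introduced only for illustration. Each write optionally names the earlier write it depends on, and a copy is treated as restartable only if no write appears without its dependency already present.

```python
# Hypothetical model: each write is (name, name_of_write_it_depends_on or None).
def is_restartable(applied):
    """Return True if every applied write's dependency was applied before it."""
    seen = set()
    for name, depends_on in applied:
        if depends_on is not None and depends_on not in seen:
            return False  # a gap or misordering: the dependency is missing
        seen.add(name)
    return True

# The application issues each write only after the previous one has completed.
writes = [
    ("log-1", None),
    ("commit-1", "log-1"),
    ("log-2", "commit-1"),
    ("commit-2", "log-2"),
]

# Any prefix of the in-order sequence is a state the application can restart from.
prefixes_ok = all(is_restartable(writes[:n]) for n in range(len(writes) + 1))

# Applying the same writes out of order can leave a commit without its log record.
reordered = [writes[1], writes[0]]
reordered_ok = is_restartable(reordered)
```

Under these assumptions, an interruption at any point in the ordered sequence leaves a usable copy, whereas the reordered sequence does not. This is why both the local storage and any remote copy are expected to preserve the order in which writes were received.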
At present, there are two primary methods to reliably maintain a remote copy suitable for application restart: synchronous and semi-synchronous remote replication. In accordance with the synchronous remote replication method, each write received is simultaneously applied to both the local disks and the remote disks. In order to ensure correct ordering of dependent writes, storage systems typically only allow one write to occur at a time and do not complete a write operation until the remote copy has been updated. Since write requests are not completed until the remote copy has been updated, the average latency of each write operation is increased to the time required to update the remote copy. That amount of time depends on, amongst other things, the geographic distance between the source of the request and the remote system, as well as the speed of the link between the two. Generally, the greater the distance, the longer the latency. This increased latency combined with the serial restriction needed to ensure the correct ordering of dependent writes can have a significant impact on application performance. As a result, it is difficult to construct geographically diverse disaster recovery solutions using a synchronous replication solution while maintaining acceptable application performance.
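The effect of distance on a serialized synchronous scheme can be estimated with a rough sketch. The figures are assumptions chosen for illustration: a signal speed of about 200 km per millisecond in optical fibre, a 1 ms remote disk update, and no switching or protocol overhead. The function names are hypothetical.

```python
# Assumption: signals travel roughly 200 km per millisecond in optical fibre.
FIBER_KM_PER_MS = 200.0

def round_trip_ms(distance_km):
    """Time for a request to reach the remote site and the reply to return."""
    return 2 * distance_km / FIBER_KM_PER_MS

def sync_write_latency_ms(distance_km, remote_disk_ms=1.0):
    """Latency of one synchronous write: round trip plus remote disk update."""
    return round_trip_ms(distance_km) + remote_disk_ms

def max_serial_writes_per_second(distance_km, remote_disk_ms=1.0):
    """Upper bound on throughput when only one write may be in flight."""
    return 1000.0 / sync_write_latency_ms(distance_km, remote_disk_ms)

near = max_serial_writes_per_second(100)    # remote site 100 km away
far = max_serial_writes_per_second(1000)    # remote site 1000 km away
```

With these assumed numbers, moving the remote site from 100 km to 1000 km raises the per-write latency from 2 ms to 11 ms and lowers the serialized ceiling from 500 to roughly 91 writes per second. Real links add further delay, so these values are best read as optimistic bounds.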
In accordance with the semi-synchronous remote replication method, write operations are allowed to complete locally before the remote copy has been updated. Doing so decouples the application from the latency of updating the remote copy and thereby attempts to avoid the associated performance penalties. However, in order to ensure that the remote copy remains consistent, the writes must still be applied to the remote copy in the order that they were received. Typically, storage systems accomplish this by storing writes that need to be applied to the remote copy in a queue. Sometimes, to control how out of date the remote copy gets, a maximum length for this queue is defined that, when reached, causes the replication to fall back to synchronous behavior. When this happens, application performance is negatively impacted just as it would be with a purely synchronous solution.
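The queueing behavior described above might be sketched as follows. The class `SemiSyncReplicator`, its method names, and the exact fallback policy (a write that finds the queue full waits until the queue, including itself, has been applied remotely) are assumptions made for illustration, not a description of any particular storage system.

```python
from collections import deque

class SemiSyncReplicator:
    """Hypothetical model of semi-synchronous replication with a bounded queue."""

    def __init__(self, max_queue_len):
        self.max_queue_len = max_queue_len
        self.local = []         # writes applied to the local copy
        self.remote = []        # writes applied to the remote copy
        self.pending = deque()  # writes awaiting remote application, in order

    def write(self, data):
        """Apply a write locally and report how it completed."""
        self.local.append(data)
        queue_full = len(self.pending) >= self.max_queue_len
        self.pending.append(data)
        if queue_full:
            # Assumed fallback: behave synchronously until the remote catches up.
            self.drain()
            return "sync"
        return "async"

    def drain(self, count=None):
        """Apply up to `count` queued writes to the remote copy, oldest first."""
        n = len(self.pending) if count is None else min(count, len(self.pending))
        for _ in range(n):
            self.remote.append(self.pending.popleft())

replicator = SemiSyncReplicator(max_queue_len=2)
modes = [replicator.write(block) for block in ("a", "b", "c")]
```

In this sketch the first two writes complete without waiting for the remote site, while the third finds the queue at its limit and pays the full remote latency. Because the queue is first-in, first-out, the remote copy always receives writes in the order they were received locally.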
While semi-synchronous solutions offer better performance than synchronous ones, they can still result in a stricter than necessary ordering of writes. In general, not every write issued by an application is a dependent one. Therefore, there are some writes that could be allowed to complete in parallel. In practice, it is difficult for storage systems to distinguish between dependent and non-dependent writes. Therefore, semi-synchronous solutions must default to ordering all writes in order to maintain correctness. However, the resulting overly strict serialization of writes may cause the ordering queue to reach its maximum length quickly, which in turn leads to the application performance degradation described above.
Both the synchronous and semi-synchronous solutions negatively impact application performance due to their strict serialization of writes. There is a need for an improved remote replication solution to allow better application performance while guaranteeing that the remote copy of application data remains consistent with the original, to ensure that the remote site can be used for application restart and failover in the event of a disaster.