In enterprise data centers, data replication is an important IT operation that ensures continuity in case of data loss and/or device failure. In such a context, a primary site maintains an original copy of the data, while a secondary site comprising one or more storage devices maintains a replica of some or all of the data maintained at the primary site.
Typical data replication approaches can be categorized into two classes: (i) synchronous replication and (ii) asynchronous replication. Synchronous replication approaches require that both the primary site and the secondary site commit an input/output (I/O) operation before an I/O success acknowledgment is sent back to the host. In contrast, in asynchronous replication approaches, the primary site sends an I/O success acknowledgment back to the host immediately after a local commitment, and subsequently synchronizes the new I/O change with the secondary site. Because a synchronous write cannot be acknowledged until the secondary site has responded, every host write incurs at least one round trip between the sites. Therefore, synchronous replication schemes are well-suited for scenarios wherein the network latency between the primary site and the secondary site is small, whereas asynchronous replication schemes are commonly used for long-distance secondary sites reached over wide area networks (WANs) or the Internet.
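The difference in acknowledgment timing between the two classes can be illustrated with a minimal sketch. The names Site, SyncReplicator, and AsyncReplicator, as well as the in-memory dictionary standing in for storage, are hypothetical and chosen for illustration only; they do not correspond to any particular product or to a specific implementation described herein.

```python
from collections import deque


class Site:
    """Hypothetical storage site; an in-memory dict stands in for the disks."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def commit(self, volume, offset, data):
        self.blocks[(volume, offset)] = data


class SyncReplicator:
    """Acknowledges the host only after both sites have committed."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, volume, offset, data):
        self.primary.commit(volume, offset, data)
        # In a real deployment this call crosses the network, so the host
        # waits for a full round trip before it sees the acknowledgment.
        self.secondary.commit(volume, offset, data)
        return "ack"


class AsyncReplicator:
    """Acknowledges after the local commit and ships the change later."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = deque()  # changes not yet sent to the secondary site

    def write(self, volume, offset, data):
        self.primary.commit(volume, offset, data)
        self.pending.append((volume, offset, data))
        return "ack"  # returned before the secondary site has the change

    def drain(self):
        """Send queued changes to the secondary site in FIFO order."""
        while self.pending:
            self.secondary.commit(*self.pending.popleft())
```

In this sketch, the asynchronous replicator returns its acknowledgment while the secondary site is still behind the primary site; the two sites converge only after drain() has run, which is the window in which ordering across volumes becomes a concern.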
For many applications, the order of the I/O requests is another important factor that must be preserved for application-level data consistency. One example scenario includes a database application which writes data (for example, table updates) in one storage volume, and writes logs in another storage volume. For application-level data consistency, the database application needs to first write to the log volume before writing the actual data into the data volume, and such an I/O order must be preserved at the replicas located in the secondary site. In other words, during data replication, the write I/O must be applied on the log replica volume first, before the corresponding write I/O is performed on the data replica volume, in order to satisfy the specific application data consistency requirement.
One example existing approach for keeping track of the order of I/O requests includes creating a “consistency group,” which includes multiple storage volumes (in the primary site). The data replication goal for this “consistency group” is to duplicate, at the replicas in a secondary site, both the I/O changes made to the storage volumes and the order of such changes. However, for asynchronous data replication, for example, preserving such an I/O order across multiple storage volumes presents challenges. The challenges are further complicated in instances wherein the primary site uses transmission control protocol/internet protocol (TCP/IP) based protocols such as Internet Small Computer System Interface (iSCSI) and Fibre Channel over IP (FCIP) to transmit replication data over a WAN such as the Internet, wherein TCP/IP packets (containing I/O commands) that are sequentially ordered at the primary site can arrive at the secondary site out of order due to multi-path routing and the unreliable nature of the Internet.
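By way of illustration only, one generic way to restore order at a secondary site is to tag each write in a consistency group with a group-wide sequence number at the primary site and to hold back any write that arrives early. The following minimal sketch assumes such sequence numbers exist and start at zero; the class name ConsistencyGroupReplica and its fields are hypothetical, and the sketch is not presented as the technique of any particular existing system.

```python
import heapq


class ConsistencyGroupReplica:
    """Applies writes for a group of volumes strictly in sequence order."""

    def __init__(self):
        self.next_seq = 0   # sequence number of the next write to apply
        self.buffer = []    # min-heap of writes that arrived ahead of order
        self.applied = []   # (seq, volume, data) tuples in application order

    def receive(self, seq, volume, data):
        """Accept a write that may arrive in any order over the network."""
        if seq < self.next_seq:
            return  # duplicate of a write that was already applied
        heapq.heappush(self.buffer, (seq, volume, data))
        # Apply every buffered write that is now contiguous with next_seq;
        # later writes stay buffered until the gap before them is filled.
        while self.buffer and self.buffer[0][0] <= self.next_seq:
            entry = heapq.heappop(self.buffer)
            if entry[0] == self.next_seq:
                self.applied.append(entry)
                self.next_seq += 1
            # entries with a smaller seq are buffered duplicates; drop them


# The data-volume write (seq 1) arrives before the log-volume write (seq 0).
replica = ConsistencyGroupReplica()
replica.receive(1, "data", b"row update")
held_back = len(replica.applied) == 0   # nothing applied until seq 0 arrives
replica.receive(0, "log", b"log record")
order = [volume for _, volume, _ in replica.applied]
```

In this sketch the data-volume write is held until the log-volume write has been applied, so the replica observes the log write first even though the network delivered the two writes in the opposite order. The cost is buffering at the secondary site and a stall whenever an earlier write is delayed or lost.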
Accordingly, a need exists for I/O order preservation techniques in connection with data replication.