Replication is typically employed as part of a data backup and recovery storage strategy and, as such, denotes the movement of data from a source storage space of a source domain to a target storage space of a target domain via a communications network (e.g., a computer network) in a way that enables recovery of applications from the target storage space. As used herein, recovery denotes loading of the applications on possibly different hosts (e.g., computers) where they can access the target storage space, instead of the source storage space, such that the applications are loaded to a valid state. Also, storage space denotes any storage medium having addresses that enable data to be accessed in a stable way and, as such, may apply to file system access, block access and any other storage access means.
The source domain contains at least the source storage space, but may also contain the hosts, a switching fabric and any source replication components situated outside of those components. In this context, a component may be a physical entity (e.g., a special replication appliance), a software entity (e.g., a device driver), or both. In remote disaster recovery, for example, the source domain includes an entire geographical site, but may likewise span multiple geographical sites. The target domain includes all of the remaining components relevant for replication services, including the target storage space. In addition, a replication facility includes components that may be located in both the source and target domains.
The replication facility typically has at least one component, i.e., a write interception component, which intercepts storage requests (e.g., write operations or “writes”) issued by a host to the source storage space, prior to sending the intercepted writes to the target storage space. The write interception component is typically embedded within a computing unit configured as a source replication node. When issuing a write, an application executing on the host specifies an address on the storage space, as well as the contents (i.e., write data) with which the storage space address is to be set. The write interception component may be implemented in various locations in the source domain depending on the actual replication service; such implementations may include, e.g., a device driver in the host, logic in the switching fabric, and a component within the source domain, e.g., a source storage system. The write interception component is typically located “in-band”, e.g., between the host and the source storage system, although there are environments in which the component may be located “out-of-band”, where a separate physical component, such as an appliance server, in the source domain receives duplicate writes by utilizing, e.g., an in-band splitter.
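By way of illustration only, the in-band arrangement described above may be sketched as follows. The names `StorageSpace`, `WriteInterceptor` and `send_to` are hypothetical and do not appear in any actual replication product; the sketch assumes a single host, an in-memory dictionary standing in for each storage space, and no failures.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSpace:
    """Hypothetical address-to-data store standing in for a storage space."""
    blocks: dict = field(default_factory=dict)

    def write(self, address: int, data: bytes) -> None:
        self.blocks[address] = data


@dataclass
class WriteInterceptor:
    """Hypothetical in-band component on the host-to-storage path."""
    source: StorageSpace
    intercepted: list = field(default_factory=list)  # writes awaiting transfer

    def write(self, address: int, data: bytes) -> None:
        # Record the write (address plus write data) for the target domain,
        # then pass it through to the source storage space.
        self.intercepted.append((address, data))
        self.source.write(address, data)

    def send_to(self, target: StorageSpace) -> None:
        # Apply the intercepted writes to the target in the order issued.
        for address, data in self.intercepted:
            target.write(address, data)
        self.intercepted.clear()


source, target = StorageSpace(), StorageSpace()
interceptor = WriteInterceptor(source)
interceptor.write(0x10, b"alpha")
interceptor.write(0x20, b"beta")
interceptor.write(0x10, b"gamma")  # overwrites the earlier write to 0x10
interceptor.send_to(target)
print(target.blocks == source.blocks)  # True
```

Because the intercepted writes are applied in issue order, the later write to address 0x10 prevails on the target just as it does on the source.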
Synchronous replication is a replication service wherein a write is not acknowledged until the write data associated with the write is processed by the source storage space, propagated to the target domain and persistently stored on the target storage space. An advantage of synchronous replication is the currency of the target domain data; that is, at any point in time, the writes stored on the target domain are identical to those stored on the source domain. However, a disadvantage of this replication service is the latency or propagation delay associated with communicating the writes to the target domain, which limits the synchronous replication service in terms of distance, performance and scalability.
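The acknowledgement rule of synchronous replication may be illustrated with a minimal sketch. The function `synchronous_write` and its parameters `local_ms` and `one_way_ms` are hypothetical, the delay figures are illustrative values rather than measurements, and the time needed to persist the write on the target is ignored for simplicity.

```python
def synchronous_write(source, target, address, data, local_ms, one_way_ms):
    """Return the latency (ms) the host observes before the acknowledgement."""
    source[address] = data  # processed by the source storage space
    target[address] = data  # propagated to and stored on the target
    # The acknowledgement waits for the write to reach the target and for the
    # target's confirmation to return, i.e., one full network round trip.
    return local_ms + 2 * one_way_ms


source, target = {}, {}
latency = synchronous_write(source, target, 7, b"x",
                            local_ms=1.0, one_way_ms=20.0)
print(latency)           # 41.0
print(source == target)  # True
```

Under these assumed figures, the round trip dominates the latency seen by the host, and it grows with the distance between the domains.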
An asynchronous replication service reduces such latency by requiring that the write only be processed by the source storage space without having to wait for persistent storage of the write on the target storage space. In other words, the write is acknowledged once its associated write data is processed by the source storage space; afterwards, the write (and write data) is propagated to the target domain. Thus, this replication service is not limited by distance, performance or scalability and, therefore, is often preferred over synchronous replication services. A disadvantage of the asynchronous replication service, though, is the possibility of incurring data loss should the source storage space fail before the write data has been propagated and stored on the target storage space.
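The asynchronous behavior, including its exposure to data loss, may be sketched as follows. The class `AsyncReplicator` and its methods are hypothetical names chosen for illustration; the sketch assumes an in-memory queue of pending writes and is not a description of any particular replication service.

```python
from collections import deque


class AsyncReplicator:
    """Hypothetical asynchronous replication of a single storage space."""

    def __init__(self):
        self.source = {}
        self.target = {}
        self.pending = deque()  # acknowledged writes not yet on the target

    def write(self, address, data):
        self.source[address] = data
        self.pending.append((address, data))
        return "ack"  # acknowledged before the target holds the data

    def drain(self, max_writes=None):
        """Propagate up to max_writes pending writes, in order."""
        sent = 0
        while self.pending and (max_writes is None or sent < max_writes):
            address, data = self.pending.popleft()
            self.target[address] = data
            sent += 1
        return sent

    def unreplicated(self):
        """Writes that would be lost if the source failed at this moment."""
        return list(self.pending)


rep = AsyncReplicator()
rep.write(1, b"a")
rep.write(2, b"b")
rep.write(3, b"c")
rep.drain(max_writes=2)
print(rep.target)          # {1: b'a', 2: b'b'}
print(rep.unreplicated())  # [(3, b'c')]
```

The third write has been acknowledged to the host but exists only on the source; a source failure at this point would lose it.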
For example, assume the source storage system is one of many independent (non-coordinated) storage systems that span various geographical locations of a source domain. Further, assume that a host application or multiple (coordinated) host applications issue writes to all of the source storage systems for storage on their storage spaces. These source storage spaces must be replicated consistently on a target domain such that, if a disaster arises, storage on the target domain can be recovered in a manner that maintains the order of writes issued to the source storage systems by the host(s).
Assume further that the replication service replicates writes consistently from the source storage systems to a plurality of target storage systems of the target domain. As a result, there may be a plurality of independent replication streams, e.g., one replication stream from a first source storage system to a first target storage system and one stream from a second source storage system to a second target storage system. These independent and non-coordinated replication streams are asynchronously replicated to a target storage space of the target domain. In the event of a disaster, a situation may arise where the first target storage system is recovered to a first, previous point in time and the second target storage system is recovered to a second, different previous point in time. Because the two target storage systems then reflect different points in time, the aggregated content of the target storage space on the target domain is corrupted. The present invention is directed, in part, to solving this problem by enabling synchronization among the target storage spaces.
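The inconsistency described above may be illustrated with a small check. The function `is_consistent_cut` and the write identifiers are hypothetical; the sketch makes the conservative assumption that every write depends on all writes issued before it, so that only a prefix of the issue order counts as a consistent recovery point.

```python
def is_consistent_cut(issued_order, recovered):
    """True if `recovered` is a prefix of the order in which writes were issued."""
    recovered = set(recovered)
    seen_missing = False
    for write_id in issued_order:
        if write_id in recovered:
            if seen_missing:
                # A later write survived although an earlier one was lost.
                return False
        else:
            seen_missing = True
    return True


# Order in which the host(s) issued writes to source systems A and B.
issued_order = ["A1", "B1", "A2", "B2"]

# Independent streams: the first target has caught up, the second lags.
first_target = ["A1", "A2"]
second_target = []
print(is_consistent_cut(issued_order, first_target + second_target))  # False

# A coordinated recovery point that stops both streams after B1.
print(is_consistent_cut(issued_order, ["A1", "B1"]))  # True
```

In the first case, write A2 is present on the target domain while the earlier write B1 is missing, which is the situation that arises when each target storage system is recovered to its own point in time.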
Often, a source domain having multiple hosts and/or multiple source storage systems may include only one source replication node (i.e., one write interception component) configured to intercept all writes associated with a consistency group. As used herein, a consistency group comprises storage space that requires consistent replication at a target domain. For example, assume that a large data center is configured with many source storage systems configured to serve many hosts, wherein the source storage systems cooperate to maintain a consistency group. If all write traffic is directed to the single write interception component, a substantial scalability issue arises because the interception component will not practically be able to sustain the entire traffic.
Now assume that a consistency group is configured to span multiple geographical site locations such as, e.g., among several small data centers geographically dispersed throughout a country or a plurality of countries. Here, the main reason for not using a single write interception component is not necessarily the scalability issue as much as the substantial latency introduced by such a configuration. This may necessitate either use of smaller consistency groups, which facilitates reliable and consistent group recovery on the target domain, or acceptance of large latencies and performance impact, which is undesirable. Therefore, such configurations may dictate the use of multiple write interception components.
Yet, certain prior replication solutions such as, e.g., write-level ordering asynchronous replication solutions, have been generally unable to accommodate configurations employing multiple write interception components. A possible exception is the XRC Asynchronous Replication service available from IBM Corporation, which ensures synchronization among multiple write interception components through the use of a fine-grained, extremely accurate, hardware-based global clock facility. The XRC service uses a dedicated hardware mechanism to realize such an accurate global clock and, as such, is generally tailored to mainframe computers. That is, the ability to set a time that is extremely accurate is guaranteed by the hardware mechanism built into mainframe technology. Such a mechanism is expensive and generally not deployable by systems running open, general-purpose operating systems. Furthermore, such mainframe technology may not be practically deployed in distributed environments because of latency issues, thereby rendering the hardware mechanism ineffective when servicing a consistency group that spans multiple geographical sites.
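The general reason that clock accuracy matters when ordering writes from multiple write interception components may be illustrated as follows. This is a generic sketch of timestamp-based merging and is not a description of how the XRC service is implemented; the function `merge_by_timestamp`, the timestamps and the amount of clock skew are all hypothetical.

```python
import heapq


def merge_by_timestamp(*streams):
    """Merge per-component streams of (timestamp, write_id) into one ordering.

    Each stream is assumed to be sorted by its own component's clock.
    """
    return [write_id for _, write_id in heapq.merge(*streams)]


# Actual issue order: A1 at t=1.0, B1 at t=2.0, A2 at t=3.0.
component_a = [(1.0, "A1"), (3.0, "A2")]
component_b = [(2.0, "B1")]
print(merge_by_timestamp(component_a, component_b))  # ['A1', 'B1', 'A2']

# Same writes, but component B's clock runs 1.5 time units behind.
skewed_b = [(2.0 - 1.5, "B1")]
print(merge_by_timestamp(component_a, skewed_b))     # ['B1', 'A1', 'A2']
```

With accurate clocks, the merged order matches the order of issue; with a skew larger than the spacing between writes, B1 is placed ahead of A1 and the reconstructed order is wrong.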