Data is often one of the most valuable assets of an organization, and thus the ability of an organization to reliably access its data is of paramount importance. To protect against logical corruption of the data, or loss of data due to threats such as malware or software bugs, it should be ensured that a viable copy of the data is available at all times. Regularly scheduled backups are one tool used to protect against loss of data. However, it takes time to restore lost data from a backup, and the recovered data is only as current as the time at which the most recent backup was taken. An important tool to provide reliable access to current data is data replication, in which a physical or logical storage mechanism (e.g., a device, volume or file system) is replicated by maintaining a current duplicate of the primary storage mechanism as updates are made, typically on separate hardware and often at a remote physical location. Storage replication can be utilized in the context of clustering technology, which protects against component and application failure, in order to maintain high availability of the IT infrastructure used to access data.
Clusters are groups of computers that use groups of redundant computing resources in order to provide continued service when individual system components fail. Clusters eliminate single points of failure by providing multiple servers, multiple network connections, redundant data storage, etc. Clustering systems are often combined with storage management products that provide additional useful features, such as journaling file systems, logical volume management, etc.
Where a cluster is implemented in conjunction with a storage management environment, the computer systems (nodes) of the cluster can access shared storage, such that the shared storage looks the same to each node. The shared storage is typically implemented with multiple underlying physical storage devices, which are managed by the clustering and storage system so as to appear as a single storage device to the nodes of the cluster. The multiple physical storage media can be grouped into a single logical unit which is referred to as a LUN (for “logical unit number”), and appears as a single storage device to an accessing node. Shared storage can be replicated at the LUN level, or at the level of individual logical or physical storage devices.
The management of underlying physical storage devices can also involve software level logical volume management, in which multiple physical storage devices are made to appear as a single logical volume to accessing nodes. A logical volume can be constructed from multiple physical storage devices directly, or on top of a LUN, which is in turn logically constructed from multiple physical storage devices. A volume manager can concatenate, stripe together or otherwise combine underlying physical partitions into larger, virtual ones. In a clustering environment, a cluster volume manager extends volume management across the multiple nodes of a cluster, such that each node recognizes the same logical volume layout and the same state of all volume resources. Data volumes can also be replicated, for example over a network to a remote site. Volume replication enables continuous data replication from a primary site to a secondary site, for disaster recovery or off-host processing.
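By way of illustration only, the following minimal sketch shows how a volume manager might translate a logical block address into a location on an underlying physical device, for a concatenated layout and for a striped layout. The function names, the block-based addressing and the parameters (`device_sizes`, `num_devices`, `stripe_unit`) are assumptions made for this example and do not describe any particular volume manager.

```python
from typing import List, Tuple


def concat_map(logical_block: int, device_sizes: List[int]) -> Tuple[int, int]:
    """Map a logical block to (device index, block offset) in a concatenated volume.

    Hypothetical helper: devices are laid end to end, so the first device
    holds the first device_sizes[0] logical blocks, and so on.
    """
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    remaining = logical_block
    for index, size in enumerate(device_sizes):
        if remaining < size:
            return index, remaining
        remaining -= size
    raise ValueError("logical block is beyond the end of the volume")


def stripe_map(logical_block: int, num_devices: int, stripe_unit: int) -> Tuple[int, int]:
    """Map a logical block to (device index, block offset) in a striped volume.

    Hypothetical helper: consecutive runs of stripe_unit blocks are placed
    on the devices in round-robin order.
    """
    if logical_block < 0 or num_devices <= 0 or stripe_unit <= 0:
        raise ValueError("invalid striping parameters")
    stripe_number, offset_in_unit = divmod(logical_block, stripe_unit)
    device = stripe_number % num_devices
    row = stripe_number // num_devices
    return device, row * stripe_unit + offset_in_unit


# Two devices of four blocks each, concatenated: logical block 5 is the
# second block of the second device.
print(concat_map(5, [4, 4]))   # (1, 1)

# Two devices striped with a stripe unit of two blocks: logical blocks 4 and 5
# form the second stripe unit placed on the first device.
print(stripe_map(5, 2, 2))     # (0, 3)
```

In either layout the accessing node addresses only the logical volume; the mapping to physical devices remains internal to the volume manager.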
A primary storage destination can also be replicated in a variety of other contexts, such as an enterprise level network with a plurality of computers accessing a physical or logical storage device, at a file system level, or at the level of a virtual disk file accessed by one or more virtual machines (VMs). When providing replication of data storage device(s) such as one or more physical, logical or virtual disks or one or more volumes, it is important to preserve the order of write operations to both the primary and replicated storage (this is known as maintaining dependent-write consistency). For example, if computer A first executes write operation 1 to the storage and subsequently computer B executes write operation 2, write operation 2 is dependent on write operation 1. The write operations must be executed in that order on both the replicated storage and the primary storage, or the data will be corrupted.
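The dependency between the two write operations can be illustrated with a minimal sketch. The example assumes, purely for illustration, that write operation 1 stores a data record and write operation 2 stores a commit record that is meaningful only if the data record is already present. The block names `data` and `commit` and all function names are hypothetical. The sketch checks every point at which the stream of writes could be interrupted, and reports whether the storage would be consistent at each such point.

```python
from typing import Dict, List, Tuple

Write = Tuple[str, str]  # (block name, value); a simplified stand-in for a write operation


def apply_writes(writes: List[Write]) -> Dict[str, str]:
    """Apply writes in the given order to an initially empty storage."""
    storage: Dict[str, str] = {}
    for block, value in writes:
        storage[block] = value
    return storage


def is_consistent(storage: Dict[str, str]) -> bool:
    """The commit record must never be present without the data it depends on."""
    return "commit" not in storage or "data" in storage


def all_prefixes_consistent(writes: List[Write]) -> bool:
    """Check consistency at every point where the stream could be interrupted."""
    return all(
        is_consistent(apply_writes(writes[:count]))
        for count in range(len(writes) + 1)
    )


write_1 = ("data", "record contents")    # executed first, by computer A
write_2 = ("commit", "record is valid")  # executed second, by computer B

# Same order as on the primary storage: consistent at every point in time.
print(all_prefixes_consistent([write_1, write_2]))  # True

# Reordered on the replica: an interruption after the first write leaves a
# commit record that refers to data which was never written.
print(all_prefixes_consistent([write_2, write_1]))  # False
```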
Whenever multiple computers are conducting separate input/output (I/O) operations to one or more common storage devices that are being replicated, dependent-write consistency must be maintained on the replicated storage. Typically, the replication of the storage I/O operations (e.g., writes, reads, seeks, mounts, etc.) to the primary storage destination is processed by a replication manager or appliance (often within the context of the storage management environment). The replication appliance receives the I/O operations that are executed by each computer to the primary storage device, and executes the I/O operations to the replicated storage. Because the replication level processing of the storage I/O operations originating from the multiple devices can be asynchronous, the I/O operations received by the replication appliance can represent slightly different points in time, such that simply executing the operations to the replicated device in the order received could violate dependent-write consistency and corrupt the data on the replicated device. To address this, a snapshot or “consistent cut” capturing the ordered state of the I/O operations from across the multiple computers is taken periodically, to identify a point in time in the replication stream, accounting for dependent-write consistency.
Conventionally, to take a consistent cut preserving dependent-write consistency in storage replication for a group of multiple computers, the storage I/O on the multiple computers is blocked. While the storage I/O is blocked, a consistency marker is added to the storage I/O stream of each computer. These consistency markers are used by the replication appliance to determine the order of write operations made by the multiple computers, and to take a consistent cut preserving write order dependency. This works, but it requires blocking I/O, which causes delays. The more computers in the group, the longer the delay that results from blocking the I/O while the consistent cut is being taken.
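The conventional approach described above can be sketched as a simplified, single-threaded simulation. All class and function names (`Node`, `Write`, `Marker`, `take_consistent_cut`, `writes_before_cut`) are hypothetical and chosen only for this illustration; an actual system would block and resume I/O through its storage stack rather than through a flag.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(frozen=True)
class Write:
    node: str
    payload: str


@dataclass(frozen=True)
class Marker:
    cut_id: int


@dataclass
class Node:
    """One computer in the group, with its own storage I/O stream."""
    name: str
    stream: List[Union[Write, Marker]] = field(default_factory=list)
    blocked: bool = False

    def write(self, payload: str) -> bool:
        """Append a write to the stream; return False if I/O is currently blocked."""
        if self.blocked:
            return False  # the caller would have to wait, which is the delay at issue
        self.stream.append(Write(self.name, payload))
        return True


def take_consistent_cut(nodes: List[Node], cut_id: int) -> None:
    """Block I/O on every node, insert a marker in every stream, then unblock."""
    for node in nodes:
        node.blocked = True
    for node in nodes:  # no node can write while the markers are inserted
        node.stream.append(Marker(cut_id))
    for node in nodes:
        node.blocked = False


def writes_before_cut(node: Node, cut_id: int) -> List[Write]:
    """Return the writes in a node's stream that precede the given marker."""
    result: List[Write] = []
    for entry in node.stream:
        if entry == Marker(cut_id):
            return result
        if isinstance(entry, Write):
            result.append(entry)
    raise ValueError("marker not found in stream")


a, b = Node("A"), Node("B")
a.write("write 1")
b.write("write 2")
take_consistent_cut([a, b], cut_id=1)
a.write("write 3")  # issued after the cut, so excluded from it

print(writes_before_cut(a, 1))  # [Write(node='A', payload='write 1')]
print(writes_before_cut(b, 1))  # [Write(node='B', payload='write 2')]
```

In this sketch every node remains blocked until the marker has been inserted on all nodes, so the blocked interval spans three passes over the entire group, which reflects why the delay grows with the number of computers.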
It would be desirable to address these issues.