Storage replication is a data protection strategy in which data objects (e.g., files, physical volumes, logical volumes, or file systems) are replicated to provide some measure of redundancy. Storage replication may be used for many purposes, such as ensuring data availability upon storage failures, site disasters, or planned maintenance. Storage replication may also be used for purposes other than ensuring data availability. For example, workloads may be directed to a replica of a data object rather than to the primary data object.
Often, storage replication methods are designed to support a constraint known as a recovery point objective (RPO), which typically specifies an upper limit on the potential data loss upon a failure or disaster. An RPO can be specified in terms of time, write operations, amount of data changed, and the like. For example, if an RPO for a certain set of data objects is specified as twenty-four hours, then a storage replication method designed to support this RPO would need to replicate that set of data objects at least once every twenty-four hours. In other words, such a method replicates data object contents in such a way that the RPO for each data object is met.
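As a rough illustration of a time-based RPO, the following sketch checks whether a series of replication points ever leaves a gap longer than the RPO. The function name `rpo_violations` and the sample timestamps are hypothetical, and the sketch assumes each timestamp marks the point in time that the replica reflects; it ignores transfer duration and RPOs expressed in write operations or amount of data changed.

```python
from datetime import datetime, timedelta

def rpo_violations(replication_times, rpo):
    """Return pairs of consecutive replication points whose gap exceeds the RPO.

    Hypothetical helper: `replication_times` are the points in time that
    successive replicas reflect; `rpo` is a time-based RPO.
    """
    times = sorted(replication_times)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > rpo]

# Illustrative schedule: the third replication is six hours late
# for a twenty-four hour RPO.
schedule = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 2, 0, 0),
    datetime(2024, 1, 3, 6, 0),
]
violations = rpo_violations(schedule, timedelta(hours=24))
```

With the sample schedule above, only the gap between the second and third replication points exceeds twenty-four hours, so that pair is the only one reported.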
In a typical storage replication scenario, there is one primary copy of the data object and one or more replicas of the data object. According to one storage replication method, an immutable image of the primary data object is created and then the “dirty regions” of the immutable object image are copied to the corresponding replica. “Dirty regions” are portions of the primary data object that have been modified since they were last copied to the replica. The immutable image of the primary data object is needed so that copying the dirty regions results in a consistent data object replica. This cycle of creating an immutable object image and copying the dirty regions to the replica of the data object is repeated with a frequency that satisfies the RPO for the corresponding data set.
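The cycle described above can be sketched in a few lines. This is a minimal in-memory illustration, not an implementation of any particular product: the `Primary` class, the `replicate_cycle` function, and the fixed `REGION_SIZE` are all hypothetical, the immutable image is modeled as a full `bytes` copy, and writes are assumed to stay within the bounds of the data object.

```python
from dataclasses import dataclass, field

REGION_SIZE = 4  # bytes per region; deliberately tiny for illustration

@dataclass
class Primary:
    """Hypothetical primary data object that tracks its dirty regions."""
    data: bytearray
    dirty: set = field(default_factory=set)

    def write(self, offset: int, payload: bytes) -> None:
        # Assumes the write stays within the bounds of the data object.
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        first = offset // REGION_SIZE
        last = (offset + len(payload) - 1) // REGION_SIZE
        self.dirty.update(range(first, last + 1))

    def snapshot(self):
        """Return an immutable image plus the dirty set, and reset tracking."""
        image = bytes(self.data)  # immutable copy of the current contents
        dirty, self.dirty = self.dirty, set()
        return image, dirty

def replicate_cycle(primary: Primary, replica: bytearray) -> int:
    """Run one cycle: create the image, then copy only its dirty regions."""
    image, dirty = primary.snapshot()
    for region in sorted(dirty):
        start = region * REGION_SIZE
        end = start + REGION_SIZE
        replica[start:end] = image[start:end]
    return len(dirty)

primary = Primary(bytearray(16))
replica = bytearray(16)

primary.write(5, b"abc")               # touches region 1 only
first_copied = replicate_cycle(primary, replica)

primary.write(3, b"xy")                # straddles regions 0 and 1
second_copied = replicate_cycle(primary, replica)
```

Because the regions are copied from the image rather than from the live object, writes that arrive while a cycle is in progress cannot leave the replica in a mixed state; they are simply marked dirty and picked up by the next cycle.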
Creating an immutable object image, however, incurs overhead in terms of computation, I/O performance, storage space, or some combination of these. Consequently, storage replication methods are often designed to create an immutable object image and copy dirty regions infrequently, resulting in conservative (long) RPOs. Also, the immutable object image is usually created just in time to meet the RPO for the data object. Thus, the dirty regions are transferred during a narrow time window, which makes it difficult to make efficient use of resources such as CPU, disk bandwidth, and network bandwidth.