Storage replication is a data protection strategy in which data objects (e.g., files, physical volumes, logical volumes, or file systems) are replicated to provide some measure of redundancy. Storage replication may be used for many purposes, such as ensuring data availability upon storage failures, site disasters, or planned maintenance. Storage replication may also be used for purposes other than ensuring data availability. For example, workloads may be directed to a replica of a data object rather than to the primary data object.
Often, storage replication methods are designed to support a constraint known as a recovery point objective (RPO), which typically specifies an upper limit on the potential data loss upon a failure or disaster. An RPO can be specified in terms of time, write operations, amount of data changed, and the like. For example, if an RPO for a certain set of data objects is specified as twenty-four hours, then a storage replication method designed to support this RPO would need to replicate that set of data objects at least once every twenty-four hours. Such a method replicates data object contents in such a way that the RPO for each data object is met.
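By way of illustration only, the time-based RPO arithmetic described above can be sketched as follows. The function names `replication_deadline` and `rpo_violation` are hypothetical and do not come from this document, and the sketch assumes that `last_replicated` is the timestamp of the point-in-time state captured by the most recent completed replication event.

```python
from datetime import datetime, timedelta

def replication_deadline(last_replicated: datetime, rpo: timedelta) -> datetime:
    """Latest time by which the next replication must capture the data object."""
    return last_replicated + rpo

def rpo_violation(last_replicated: datetime, now: datetime, rpo: timedelta) -> timedelta:
    """Amount by which the replica's staleness exceeds the RPO (zero if within the RPO)."""
    staleness = now - last_replicated
    return max(staleness - rpo, timedelta(0))

# Example with the twenty-four hour RPO used in the text.
rpo = timedelta(hours=24)
last = datetime(2024, 1, 1, 0, 0)
print(replication_deadline(last, rpo))
print(rpo_violation(last, datetime(2024, 1, 2, 6, 0), rpo))
```

An RPO expressed in write operations or in amount of data changed could be checked in the same manner by substituting a counter for the elapsed time.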
In a typical storage replication scenario, there is one primary copy of the data object and one or more replicas of the data object. According to one storage replication method, contents of data objects are copied from the primary copy to a replica copy over a network interconnect. Certain implementations copy only the portions of a data object that have been modified since the last time the data object was copied. The copying of the contents of one or more data objects is called a replication event. Replication events are typically repeated at scheduled points in time, so that the RPO for the corresponding data objects is satisfied. Scheduling replication events for different groups of data objects independently may result in poor utilization of network bandwidth, unpredictable copy completion times, and failure to achieve the required RPO. Thus, it is desirable to schedule replication events in a way that satisfies the RPO requirements for all data objects while minimizing the total network bandwidth used over time. While intelligent scheduling of replication events may improve network bandwidth utilization, it may not be possible to satisfy the RPO for all data objects over time given the available network bandwidth. In those cases, it is desirable to schedule replication events in a manner that minimizes the impact and severity of RPO violations.
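As a minimal sketch of the scheduling problem described above, and not the scheduling method of this document, the following example orders pending replication events by earliest deadline first (EDF), a standard heuristic, and reports how late each copy would finish. The names `ReplicationJob` and `schedule_edf` are hypothetical. The sketch assumes a single shared link of fixed bandwidth, that all jobs are ready to start immediately, and that jobs run back to back without preemption.

```python
from dataclasses import dataclass

@dataclass
class ReplicationJob:
    name: str
    changed_bytes: int   # data modified since the last replication event
    deadline_s: float    # seconds from now by which the copy must finish to meet the RPO

def schedule_edf(jobs, bandwidth_bytes_per_s):
    """Run jobs in earliest-deadline order on one link.

    Returns a list of (name, finish_time_s, lateness_s), where lateness_s is
    zero when the job meets its RPO deadline.
    """
    clock = 0.0
    plan = []
    for job in sorted(jobs, key=lambda j: j.deadline_s):
        clock += job.changed_bytes / bandwidth_bytes_per_s  # transfer time on the shared link
        plan.append((job.name, clock, max(0.0, clock - job.deadline_s)))
    return plan

jobs = [
    ReplicationJob("A", changed_bytes=100, deadline_s=30.0),
    ReplicationJob("B", changed_bytes=50, deadline_s=10.0),
    ReplicationJob("C", changed_bytes=200, deadline_s=20.0),
]
for name, finish, late in schedule_edf(jobs, bandwidth_bytes_per_s=10):
    print(name, finish, late)
```

Under these assumptions, earliest-deadline ordering is known to minimize the maximum lateness across jobs, which is one possible measure of the severity of RPO violations. Other measures, such as the number of violated data objects or a weighted total of lateness, would call for a different ordering.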