Data is at the heart of every enterprise, and a core component of data center infrastructure. As data applications become more and more critical, there is a growing need to ensure complete business continuity.
Disaster recovery systems provide data protection and application recovery. Some disaster recovery systems use virtual data replication within a hypervisor architecture, and are able to recover any point in time.
Objectives of disaster recovery plans are generally formulated in terms of recovery point objective (RPO) and recovery time objective (RTO).
RPO is a point in time to which data must be recovered. RPO indicates an amount of data that an enterprise determines is an acceptable loss in a disaster situation. RPO allows an enterprise to define a window of time before a disaster during which data may be lost. The value of the data in this window of time may be weighed against the cost of the additional disaster prevention measures that would be necessary to shorten the window.
RTO is the time it takes to get a non-functional system back on-line, and indicates how fast the enterprise will be up and running after a disaster. Specifically, RTO is the duration of time within which a business process must be restored after a disaster, in order to avoid unacceptable consequences associated with a break in business continuity. Most disaster recovery systems provide RTOs on the order of several hours.
RPO is independent of RTO. If the RPO of an enterprise is two hours, then when a system is brought back on-line after a disaster, all data must be restored to a point within two hours before the disaster. But the enterprise has acknowledged that data in the two hours immediately preceding the disaster may be lost; i.e., the acceptable loss window is two hours.
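The two-hour example above can be expressed as a simple check. This is a minimal sketch only; the function name `meets_rpo` and the timestamps are hypothetical and are not part of any particular disaster recovery product.

```python
from datetime import datetime, timedelta


def meets_rpo(disaster_time: datetime, recovery_point: datetime, rpo: timedelta) -> bool:
    """Return True if the restored data is no older than `rpo` before the disaster."""
    data_loss_window = disaster_time - recovery_point
    # The recovery point cannot lie after the disaster, and the lost window
    # must not exceed the acceptable loss window.
    return timedelta(0) <= data_loss_window <= rpo


# Hypothetical example: disaster at 12:00 with a two-hour RPO.
disaster = datetime(2024, 1, 1, 12, 0)
two_hours = timedelta(hours=2)

print(meets_rpo(disaster, datetime(2024, 1, 1, 10, 30), two_hours))  # True
print(meets_rpo(disaster, datetime(2024, 1, 1, 9, 0), two_hours))    # False
```

The check says nothing about how long the restoration itself takes, which is the separate concern addressed by RTO.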
Conventional disaster recovery systems are single-RPO systems; i.e., a single RPO objective applies. In this regard, reference is made to FIG. 1, which is a screen shot of a prior art single-RPO console for a disaster recovery application. As shown in FIG. 1, a single RPO objective designates an RPO threshold time of 2 minutes, and a maintenance history of 4 hours.
An enterprise may share its overall bandwidth between its actual production system, and its disaster recovery system. An RPO may be controlled by the amount of bandwidth of the overall enterprise system allocated to the disaster recovery system. By allocating more bandwidth to the disaster recovery system, the RPO may be improved to reduce the window of data loss in case of disaster, but less bandwidth is then available for the enterprise production system. Conversely, by allocating less bandwidth to the disaster recovery system, the RPO is degraded to increase the window of data loss in case of disaster, and more bandwidth is then available for the enterprise production system.
When a disaster recovery system falls short of its RPO objective, it issues an RPO alert. If an administrator receives a series of RPO alerts, it generally means that the bandwidth allocated to the disaster recovery system is insufficient for the RPO objective, and the RPO objective must be relaxed or additional bandwidth must be allocated. Generally, allocating additional bandwidth to the disaster recovery system entails the expense of obtaining additional bandwidth for the overall enterprise system, in order not to degrade the enterprise production system.
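The relationship between allocated bandwidth, replication lag, and RPO alerts may be illustrated with a toy simulation. This is a sketch under simplifying assumptions that are not taken from any actual system: time advances in one-minute steps, writes are replicated oldest-first, and the lag is measured as the age of the oldest unreplicated data. The function name and all figures are hypothetical.

```python
from collections import deque


def simulate_rpo_alerts(writes_mb_per_min, bandwidth_mb_per_min, rpo_threshold_min):
    """Return the minutes at which replication lag exceeds the RPO threshold."""
    queue = deque()  # entries of [minute_written, remaining_mb], oldest first
    alerts = []
    for minute, written in enumerate(writes_mb_per_min):
        if written > 0:
            queue.append([minute, written])
        # Replicate as much queued data as the allocated bandwidth allows.
        budget = bandwidth_mb_per_min
        while queue and budget > 0:
            sent = min(budget, queue[0][1])
            queue[0][1] -= sent
            budget -= sent
            if queue[0][1] == 0:
                queue.popleft()
        # Lag at the end of this minute: age of the oldest unreplicated data.
        lag = (minute + 1 - queue[0][0]) if queue else 0
        if lag > rpo_threshold_min:
            alerts.append(minute)
    return alerts


# Hypothetical workload with a burst of writes in minutes 2 to 4.
writes = [10, 10, 50, 50, 50, 10, 10, 10]

print(simulate_rpo_alerts(writes, bandwidth_mb_per_min=20, rpo_threshold_min=2))  # [5, 6, 7]
print(simulate_rpo_alerts(writes, bandwidth_mb_per_min=50, rpo_threshold_min=2))  # []
```

In this toy model the series of alerts disappears either when more bandwidth is allocated or when the threshold is relaxed, which mirrors the two options available to the administrator.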
Enterprise production systems generate data at varying rates, according to peak times and off-peak times. Peak times and off-peak times may be different for different applications. For example, a customer relationship management (CRM) application may have peak times during normal working hours and off-peak times during nights and weekends, whereas a fast-food ordering system may have peak hours during nights and weekends. Maintaining a low RPO at all times requires a WAN link with a bandwidth that accommodates the peak rates. Such lines are expensive, and are fully utilized only during relatively short time periods.
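The cost of sizing a link for peak rates can be made concrete with a short calculation. The hourly rates below are invented for illustration, and the helper name `link_utilization` is hypothetical.

```python
def link_utilization(rates_mbps):
    """Return (peak rate, average utilization) for a link sized to the peak rate."""
    peak = max(rates_mbps)
    average = sum(rates_mbps) / len(rates_mbps)
    return peak, average / peak


# Hypothetical day: 8 peak hours at 100 Mbps and 16 off-peak hours at 10 Mbps.
hourly_rates = [100] * 8 + [10] * 16

peak, utilization = link_utilization(hourly_rates)
print(peak)         # 100
print(utilization)  # 0.4
```

Under these assumed figures, a link provisioned for the peak rate is used at only 40 percent of its capacity on average.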
Cloud service disaster recovery systems support multiple enterprises. Maintaining high bandwidth lines for all enterprises is expensive, and does not fully utilize the lines most of the time. Reducing the bandwidth results in system alerts when the RPO is exceeded, which are alerts that the system administrator wants to avoid. Increasing the RPO threshold avoids those alerts, but also suppresses alerts that should be issued.
It would thus be of advantage to enable a disaster recovery system to designate different RPO objectives based on the day and the time of day, with different respective bandwidths allocated to the disaster recovery system.
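One way such day- and time-dependent objectives might be represented is sketched below. The data structure, the field names, and every value in the example schedule are assumptions made purely for illustration; they do not describe an actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class RpoWindow:
    days: frozenset       # weekday numbers, Monday = 0 ... Sunday = 6
    start: time           # inclusive
    end: time             # exclusive
    rpo_seconds: int      # RPO objective in effect during the window
    bandwidth_mbps: int   # bandwidth allocated to disaster recovery


def lookup(schedule, default, when: datetime) -> RpoWindow:
    """Return the first window covering `when`, or `default` if none matches."""
    for window in schedule:
        if when.weekday() in window.days and window.start <= when.time() < window.end:
            return window
    return default


# Hypothetical schedule: one objective for weekday working hours, another otherwise.
ALL_DAYS = frozenset(range(7))
WEEKDAYS = frozenset(range(5))
default = RpoWindow(ALL_DAYS, time(0, 0), time(23, 59, 59), rpo_seconds=120, bandwidth_mbps=100)
schedule = [RpoWindow(WEEKDAYS, time(9, 0), time(17, 0), rpo_seconds=600, bandwidth_mbps=50)]

monday_morning = lookup(schedule, default, datetime(2024, 1, 1, 10, 0))
saturday_morning = lookup(schedule, default, datetime(2024, 1, 6, 10, 0))
print(monday_morning.rpo_seconds, monday_morning.bandwidth_mbps)      # 600 50
print(saturday_morning.rpo_seconds, saturday_morning.bandwidth_mbps)  # 120 100
```

A lookup of this kind would allow both the alert threshold and the bandwidth allocation to follow the schedule, rather than being fixed at a single value.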