Data is at the heart of every enterprise, and is a core component of data center infrastructure. As data applications become increasingly critical, there is a growing need for disaster recovery systems that support application deployment and provide complete business continuity.
Disaster recovery systems are responsible for data protection and application recovery. Some disaster recovery systems provide continuous data protection, and allow recovery to any point-in-time.
Some disaster recovery systems provide built-in test capabilities, which enable an administrator to test recovery to a previous point in time. When a previous point in time is selected for testing, the disaster recovery system presents a disk image to the enterprise data applications as that image existed at the selected point in time. All reads from the disk are directed to the disaster recovery system, which determines where the data for the previous point in time is located: on a replica, or on a redo journal. All writes to the disk are recorded in a separate redo log, so that they can be erased after the test is complete.
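The read and write redirection described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular disaster recovery system. The class name `PointInTimeImage`, the block-number keys, the integer timestamps, and the choice to let reads see writes made earlier in the same test are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PointInTimeImage:
    """Hypothetical test image for one previous point in time."""
    replica: Dict[int, bytes]              # block number -> data on the replica
    journal: List[Tuple[int, int, bytes]]  # (timestamp, block, data), oldest first
    point_in_time: int                     # the point in time under test
    test_log: Dict[int, bytes] = field(default_factory=dict)  # separate log of test writes

    def read(self, block: int) -> bytes:
        # Assumption: a test sees its own earlier writes.
        if block in self.test_log:
            return self.test_log[block]
        # Newest journal entry for this block at or before the tested point in time.
        for timestamp, journal_block, data in reversed(self.journal):
            if journal_block == block and timestamp <= self.point_in_time:
                return data
        # Otherwise the data for that point in time is on the replica.
        return self.replica[block]

    def write(self, block: int, data: bytes) -> None:
        # Test writes never touch the replica or the journal.
        self.test_log[block] = data

    def end_test(self) -> None:
        # Erase everything written during the test.
        self.test_log.clear()


replica = {0: b"A", 1: b"B"}
journal = [(10, 0, b"A1"), (20, 1, b"B1"), (30, 0, b"A2")]
image = PointInTimeImage(replica, journal, point_in_time=20)
image.write(1, b"scratch")
image.end_test()
```

In this sketch, ending the test leaves the replica and the journal exactly as they were, because test writes only ever reach `test_log`.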
There are many advantages to testing a previous point-in-time image, including ensuring that a replica is usable, and finding a point in time for recovery prior to a disaster. In a case where data became corrupted at an unknown time, it is advantageous to find the previous point in time closest to the time of corruption at which the disk image was still uncorrupted, in order to minimize loss of data after recovery.
Objectives of disaster recovery plans are generally formulated in terms of a recovery time objective (RTO). RTO is the time it takes to get a non-functional system back online, and indicates how fast the organization will be up and running after a disaster. Specifically, RTO is the duration of time within which a business process must be restored after a disaster, in order to avoid unacceptable consequences associated with a break in business continuity. Searching for an appropriate point in time prior to failover generally requires testing multiple disk images at different points in time, which itself takes a long time to complete and significantly increases the RTO.
In addition, testing multiple disk images generally requires a complete copy of the data. As such, if a disk image is 2 TB and three points in time are to be tested, the storage consumption is at least 8 TB, corresponding to three tests and the replica's gold copy. This drawback makes it costly and impractical to test multiple disk copies in parallel.
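The storage figure in the example above follows from simple arithmetic, sketched here in Python. The function name `full_copy_storage_tb` is hypothetical, and the calculation assumes every tested point in time needs a full copy of the same size as the gold copy.

```python
def full_copy_storage_tb(image_tb: float, points_in_time: int) -> float:
    """Minimum storage when each tested point in time needs a full copy.

    One full copy per test, plus the replica's gold copy.
    """
    return image_tb * (points_in_time + 1)


# 2 TB image, three points in time: 3 test copies + 1 gold copy
print(full_copy_storage_tb(2, 3))  # prints 8
```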
It would thus be advantageous to expose multiple disk images at different points in time, as offsets from a gold image, so that the images can be tested in parallel and one of them selected for failover without duplication of data, in support of the enterprise RTO.
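As a rough illustration of what exposing images "as offsets from a gold image" could look like, the following Python sketch lets several images share one gold copy, with each image storing only the blocks in which it differs. The class name `OffsetImage` and the dictionary-based block store are assumptions made for illustration, not a description of any actual system.

```python
from typing import Dict


class OffsetImage:
    """Hypothetical disk image expressed as offsets from a shared gold image."""

    def __init__(self, gold: Dict[int, bytes], offsets: Dict[int, bytes]):
        self.gold = gold              # shared reference, never copied
        self.offsets = dict(offsets)  # only the blocks that differ from gold

    def read(self, block: int) -> bytes:
        # Prefer this image's own offset, otherwise fall back to the gold image.
        if block in self.offsets:
            return self.offsets[block]
        return self.gold[block]


gold = {0: b"a", 1: b"b", 2: b"c"}
images = [
    OffsetImage(gold, {1: b"b1"}),
    OffsetImage(gold, {1: b"b2", 2: b"c2"}),
]
```

In this sketch, the extra storage per image is the size of its `offsets` alone, so several points in time can be exposed side by side without a full copy for each.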