Information technology systems, including storage systems, may need protection from site disasters or outages, where outages may be planned or unplanned. Furthermore, information technology systems may require features for data migration, data backup, or data duplication. Implementations for disaster or outage recovery, data migration, data backup and data duplication may include mirroring or copying of data in storage systems. Such mirroring or copying of data may involve interactions among hosts, storage systems and connecting networking components of the information technology system.
An enterprise storage server (ESS), such as the IBM* TotalStorage Enterprise Storage Server*, may be a disk storage server that includes one or more processors coupled to storage devices, including high capacity scalable storage devices, Redundant Array of Independent Disks (RAID), etc. The enterprise storage servers are connected to a network and include features for copying data in storage systems.
Peer-to-Peer Remote Copy (PPRC) is an ESS function that allows the shadowing of application system data from a first site to a second site. The first site may be referred to as an application site, a local site, or a primary site. The second site may be referred to as a recovery site, a remote site, or a secondary site. The logical volumes that hold the data in the ESS at the local site are called local volumes, and the corresponding logical volumes that hold the mirrored data at the remote site are called remote volumes. High speed links, such as ESCON links, may connect the local and remote ESS systems.
In the synchronous type of operation for PPRC, i.e., synchronous PPRC, the updates done by a host application to the local volumes at the local site are synchronously shadowed onto the remote volumes at the remote site. As synchronous PPRC is a synchronous copying solution, write updates are ensured on both copies (local and remote) before the write is considered to be completed for the host application. In synchronous PPRC the host application does not get the “write complete” condition until the update is synchronously done in both the local and the remote volumes. Therefore, from the perspective of the host application, the data at the remote volumes at the remote site is equivalent to the data at the local volumes at the local site.
Synchronous PPRC increases the response time as compared to an asynchronous copy operation, and this is inherent to the synchronous operation. The overhead comes from the additional steps that are executed before the write operation is signaled as completed to the host application. Also, the PPRC activity between the local site and the remote site consists of signals and data that travel through the links that connect the sites, and the response time overhead of the host application write operations increases proportionally with the distance between the sites. Therefore, the distance affects a host application's response time. In certain implementations, there may be a maximum supported distance for synchronous PPRC operations, referred to as the synchronous communication distance.
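By way of illustration only, the distance dependence described above can be sketched in Python. The sketch assumes a signal speed in optical fiber of roughly 200,000 km per second and counts only propagation delay; the function name and the `round_trips` parameter are hypothetical, and the number of link round trips an actual synchronous PPRC write requires is not stated here.

```python
# Approximate signal speed in optical fiber (about two thirds of the
# speed of light in vacuum). This is an assumed round figure.
FIBER_KM_PER_SEC = 200_000.0


def added_write_latency_ms(distance_km, round_trips=1):
    """Estimate the propagation delay added to one synchronous write.

    distance_km: one-way link distance between the local and remote sites.
    round_trips: hypothetical number of link round trips per write.
    Returns the added delay in milliseconds. Protocol, switching, and
    remote-side processing overheads are deliberately ignored.
    """
    if distance_km < 0 or round_trips < 1:
        raise ValueError("distance must be >= 0 and round_trips >= 1")
    one_way_sec = distance_km / FIBER_KM_PER_SEC
    return 2.0 * one_way_sec * round_trips * 1000.0


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km:>5} km: {added_write_latency_ms(km):.3f} ms added per write")
```

Under these assumptions the added delay grows linearly with distance, which is consistent with a bounded synchronous communication distance.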
In the Extended Distance PPRC (also referred to as PPRC Extended Distance) method of operation, PPRC mirrors the updates of the local volumes onto the remote volumes in an asynchronous manner, while the host application is running. In Extended Distance PPRC, the host application receives a write complete response before the update is copied from the local volumes to the remote volumes. In this way, when in Extended Distance PPRC, a host application's write operations are free of the overhead penalty typical of synchronous PPRC. Therefore, Extended Distance PPRC is suitable for remote copy solutions at very long distances with minimal impact on host applications. However, Extended Distance PPRC does not continuously maintain an equivalent copy of the local data at the remote site.
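The difference between the two modes of operation may be illustrated with the following minimal Python sketch. It is a simplified model, not the ESS implementation: the class names, the representation of a volume as a dictionary of tracks, and the use of a set to record tracks awaiting copy are all assumptions made for illustration.

```python
class Volume:
    """A logical volume modeled as a mapping from track number to data."""

    def __init__(self):
        self.tracks = {}


class SynchronousPair:
    """Models synchronous PPRC: both copies are updated before completion."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def host_write(self, track, data):
        self.local.tracks[track] = data
        self.remote.tracks[track] = data  # remote update precedes completion
        return "write complete"


class ExtendedDistancePair:
    """Models Extended Distance PPRC: completion precedes the remote copy."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.out_of_sync = set()  # hypothetical record of tracks to copy

    def host_write(self, track, data):
        self.local.tracks[track] = data
        self.out_of_sync.add(track)  # remote copy is deferred
        return "write complete"

    def copy_pending(self):
        """Copy deferred tracks to the remote volume; return the count."""
        copied = 0
        while self.out_of_sync:
            track = self.out_of_sync.pop()
            self.remote.tracks[track] = self.local.tracks[track]
            copied += 1
        return copied


if __name__ == "__main__":
    pair = ExtendedDistancePair(Volume(), Volume())
    pair.host_write(7, b"payload")
    print("equivalent before copy:", pair.remote.tracks == pair.local.tracks)
    pair.copy_pending()
    print("equivalent after copy: ", pair.remote.tracks == pair.local.tracks)
```

In this model the remote volume of the synchronous pair matches the local volume whenever a write has completed, whereas the remote volume of the extended distance pair matches only after the deferred tracks have been copied.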
Further details of the PPRC are described in the IBM publication “IBM TotalStorage Enterprise Storage Server: PPRC Extended Distance,” IBM document number SG24-6568-00 (Copyright IBM, 2002), which publication is incorporated herein by reference in its entirety.
Additional flexibility and safety in data storage can be achieved by combining synchronous PPRC and asynchronous Extended Distance PPRC elements in a single data storage system. One such system is disclosed in co-pending and commonly assigned U.S. patent application Ser. No. 10/464,024, filed Jun. 17, 2003, entitled “Method, System, and Article of Manufacture for Remote Copying of Data,” which application is incorporated herein by reference in its entirety. The cascading data storage system described in U.S. patent application Ser. No. 10/464,024 features a first storage unit receiving data from the I/O operations of a host computer. A first storage controller is associated with the first storage unit and synchronously mirrors the data to a second storage unit associated with a second storage controller, which in turn asynchronously mirrors the data to a third storage unit. Typically, the first, second and third storage units are maintained at separate locations. It is common for the first storage unit to be maintained at the main application site. The second storage unit is often maintained at a bunker site near enough to the first storage unit to maintain an efficient synchronous PPRC relationship, but separated and protected from the first storage unit in order to decrease the chance that the first and second storage units would both be destroyed in a common disaster. The third storage unit can be located at any distance from the second storage unit.
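The cascaded arrangement described above may be sketched, under simplifying assumptions, as follows. The Python class and method names are hypothetical, each storage unit is reduced to a dictionary of tracks, and the set used to record tracks awaiting the asynchronous hop is an illustrative device rather than the mechanism of the referenced application.

```python
class StorageUnit:
    """A storage unit reduced to a name and a track-to-data mapping."""

    def __init__(self, name):
        self.name = name
        self.tracks = {}


class CascadedCopy:
    """First unit mirrors synchronously to the second unit; the second
    unit mirrors asynchronously to the third unit."""

    def __init__(self):
        self.first = StorageUnit("application site")
        self.second = StorageUnit("bunker site")
        self.third = StorageUnit("remote site")
        self.pending_for_third = set()  # hypothetical deferred-copy record

    def host_write(self, track, data):
        # Synchronous hop: first and second units are both updated
        # before the host sees completion.
        self.first.tracks[track] = data
        self.second.tracks[track] = data
        # Asynchronous hop: the copy to the third unit is deferred.
        self.pending_for_third.add(track)
        return "write complete"

    def drain_to_third(self):
        """Carry out the deferred copies from the second to the third unit."""
        copied = len(self.pending_for_third)
        for track in sorted(self.pending_for_third):
            self.third.tracks[track] = self.second.tracks[track]
        self.pending_for_third.clear()
        return copied


if __name__ == "__main__":
    cascade = CascadedCopy()
    cascade.host_write(3, b"update")
    print("second matches first:", cascade.second.tracks == cascade.first.tracks)
    print("third matches second:", cascade.third.tracks == cascade.second.tracks)
    cascade.drain_to_third()
    print("third after drain:   ", cascade.third.tracks == cascade.second.tracks)
```

In this model the second storage unit is current as of every completed host write, while the third storage unit may lag until the deferred copies have been carried out.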
As is discussed in U.S. application Ser. No. 10/464,024, return to full operation at the first storage unit after a failure can be accomplished by performing a full copy of all volumes maintained on the second or third storage units to the first storage unit. Unfortunately, a full volume copy may take hours depending upon the amount of data stored in the respective storage units. Therefore, a need exists in the art for a recovery method and apparatus that avoid the need for full copies of volumes to restore the configuration to normal operation.
The present invention is directed to overcoming one or more of the problems discussed above.