Embodiments of the invention relate to maintaining multiple target copies.
Computing systems often include one or more production computers (for processing data and running application programs), direct access storage devices (DASDs) for storing data, and a storage controller for controlling the transfer of data between the production computers and the DASDs. Storage controllers, also referred to as control units or storage directors, manage access to a storage space comprised of numerous hard disk drives connected in a loop architecture, otherwise referred to as a Direct Access Storage Device (DASD). Production computers may communicate Input/Output (I/O) requests to the storage space through the storage controller.
The storage of data in large organizations is important, both for reliability of the data and for the ability to recover data in the event of any hardware failure. A Storage Area Network (SAN) is an architecture that is used when very large amounts of data need to be stored in a reliable and secure manner. This technology allows networks to be created that support the attachment of remote computer storage devices, such as disk arrays, to servers in such a way that, to the operating system, the devices appear as locally attached. It is common in these networks to include a large amount of redundancy, both in the data storage and in the hardware connections between the individual components. Various techniques exist for creating data redundancy.
In many systems, a data volume on one storage device, such as a DASD, may be copied to the same or another storage device. A point-in-time copy involves physically copying all the data from a source volume to a target volume so that the target volume has a copy of the data as of a point in time. A point-in-time copy can also be made by logically making a copy of the data and then copying data over only when necessary, in effect deferring the physical copying. This logical copy operation is performed to minimize the time during which the target and source volumes are inaccessible.
A number of direct access storage device (DASD) subsystems are capable of performing “instant virtual copy” operations, also referred to as “fast replicate functions.” Instant virtual copy operations work by modifying metadata, such as relationship tables or pointers, to treat a source data object as both the original and copy. In response to a production computer's copy request, the storage subsystem immediately reports creation of the copy without having made any physical copy of the data. Only a “virtual” copy has been created, and the absence of an additional physical copy is completely unknown to the production computer.
Later, when the storage system receives updates to the original or copy, the updates are stored separately and cross-referenced to the updated data object only. At this point, the original and copy data objects begin to diverge. The initial benefit is that the instant virtual copy occurs almost instantaneously, completing much faster than a normal physical copy operation. This frees the production computer and storage subsystem to perform other tasks. The production computer or storage subsystem may even proceed to create an actual, physical copy of the original data object during background processing, or at another time.
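The behavior described above may be illustrated with a minimal sketch. It assumes a simplified model in which a data object is a shared, read-only base plus a table of separately stored updates; the class name DataObject and its methods are hypothetical and are not taken from any storage product.

```python
class DataObject:
    """Hypothetical model of an original or a virtual copy.

    Both the original and the copy reference the same base blocks.
    Updates are stored separately, per object, in an overlay table.
    """

    def __init__(self, base, overlay=None):
        self._base = base                    # shared reference, never modified
        self._overlay = dict(overlay or {})  # block index -> updated data

    def instant_copy(self):
        # Only metadata is duplicated: the base reference and the
        # (initially small) update table. No base block is copied.
        return DataObject(self._base, self._overlay)

    def read(self, index):
        # Prefer this object's own update, else fall back to shared data.
        return self._overlay.get(index, self._base[index])

    def write(self, index, data):
        # The update is cross-referenced to this object only.
        self._overlay[index] = data


base = ("a", "b", "c")
original = DataObject(base)
copy = original.instant_copy()   # completes without copying any block
original.write(0, "A")           # original and copy begin to diverge
copy.write(1, "B")
```

In this sketch the copy request returns as soon as the metadata is set up, and the two objects diverge only for the blocks that are later updated.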
One such instant virtual copy operation is known as a FLASHCOPY® operation. (FLASHCOPY is a registered trademark or common law mark of International Business Machines Corporation in the United States and/or other countries.) A FLASHCOPY® operation involves establishing a logical point-in-time relationship between source and target volumes on the same or different devices.
Instant virtual copy techniques, such as a FLASHCOPY® operation, provide a point-in-time copy tool. Thus, an instant virtual copy may be described as an instant snapshot of a data set or volume.
For example, a function such as a FLASHCOPY® function enables an administrator to make point-in-time, full volume copies of data, with the copies immediately available for read or write access. The FLASHCOPY® function can be used with standard backup tools that are available in the environment to create backup copies on tape. A FLASHCOPY® function creates a copy of a source volume on a target volume. This copy, as mentioned above, is called a point-in-time copy. When a FLASHCOPY® operation is initiated, a relationship is created between a source volume and target volume. This relationship is a “mapping” of the source volume and the target volume. This mapping allows a point-in-time copy of that source volume to be copied to the associated target volume. The relationship exists between this volume pair from the time that the FLASHCOPY® operation is initiated until the storage unit copies all data from the source volume to the target volume, or the relationship is deleted.
When the data is physically copied, a background process copies tracks from the source volume to the target volume. The amount of time that it takes to complete the background copy depends on various criteria, such as the amount of data being copied, the number of background copy processes that are running, and any other activities that are presently occurring. The FLASHCOPY® function works because the data being copied does not actually need to be copied instantaneously; it only needs to be copied just prior to an update causing an overwrite of any old data on the source volume. So, as data changes on the source volume, the original data is copied to the target volume before being overwritten on the source volume.
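The copy-before-overwrite behavior and the background copy described above may be sketched as follows. This is an illustrative model only: the class PointInTimeMap, its per-track "copied" flags, and the assumption that every write covers a whole track are hypothetical simplifications, not the actual FLASHCOPY® implementation.

```python
class PointInTimeMap:
    """Hypothetical source-to-target mapping with deferred physical copy."""

    def __init__(self, source, target):
        self.source = source                  # list of tracks
        self.target = target                  # list of tracks, same length
        self.copied = [False] * len(source)   # per-track "already copied" flag

    def _copy_track(self, i):
        if not self.copied[i]:
            self.target[i] = self.source[i]
            self.copied[i] = True

    def write_source(self, i, data):
        # Preserve the point-in-time data on the target just before
        # the old data on the source is overwritten.
        self._copy_track(i)
        self.source[i] = data

    def write_target(self, i, data):
        # Assumes a full-track write, so no old data needs to be merged.
        self.target[i] = data
        self.copied[i] = True

    def read_target(self, i):
        # Uncopied tracks are still served from the source.
        return self.target[i] if self.copied[i] else self.source[i]

    def background_copy_step(self):
        # Copy one outstanding track; return False when none remain.
        for i, done in enumerate(self.copied):
            if not done:
                self._copy_track(i)
                return True
        return False


source = ["a", "b", "c"]
target = [None, None, None]
mapping = PointInTimeMap(source, target)
mapping.write_source(0, "A")           # old "a" is copied to the target first
image_track_1 = mapping.read_target(1) # served from the source, not yet copied
while mapping.background_copy_step():  # background process finishes the copy
    pass
```

Once the loop finishes, every track has been physically copied, which corresponds to the point at which the relationship between the volume pair may end.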
Therefore, a FLASHCOPY® function is a feature supported on various storage devices that allows a user or an automated process to make nearly instantaneous copies of entire logical volumes of data. A copy of a source disk is made on a target disk. The copies are immediately available for both read and write access. A common feature of FLASHCOPY®-like implementations is the ability to reverse the copy, that is, to populate the source disk of a FLASHCOPY® map with the contents of the target disk.
Multiple target instant virtual copy is a common copy service feature in storage controllers, enabling the user to create many point-in-time images (i.e., instant virtual copies) of a production volume. These volumes may be used for restoring data when there is data corruption or when data has been changed due to test and development activities. The reasons for using such a feature are many, and the number of copies required is growing. High level features, such as near Continuous Data Protection (CDP) or Golden Image, are also available. CDP may be described as a backup of data that automatically saves a copy of every change made to that data. An original version of data (e.g., an application) may be referred to as a Golden Image, and multiple copies may be made from this Golden Image.
Some basic techniques that may be adopted to implement instant virtual copy functionality include copy-on-write and redirect-on-write. Both copy-on-write and redirect-on-write track an image's data.
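For contrast with copy-on-write, a redirect-on-write scheme may be sketched as follows. The sketch assumes a simple append-only block store and a logical-to-physical map; the class RedirectOnWriteVolume and its methods are hypothetical names used only for illustration.

```python
class RedirectOnWriteVolume:
    """Hypothetical volume in which new writes go to new locations."""

    def __init__(self, blocks):
        self.store = list(blocks)              # physical blocks, append-only
        self.live = list(range(len(blocks)))   # logical index -> physical index

    def snapshot(self):
        # An image is just a copy of the map; no data block is copied.
        return list(self.live)

    def write(self, index, data):
        # Redirect the write to a fresh location and repoint the live map.
        # The old block stays in place for any image that references it.
        self.store.append(data)
        self.live[index] = len(self.store) - 1

    def read(self, index, image=None):
        mapping = self.live if image is None else image
        return self.store[mapping[index]]


volume = RedirectOnWriteVolume(["a", "b"])
image = volume.snapshot()
volume.write(0, "A")
```

In this sketch a production write costs one physical write regardless of the image, whereas copy-on-write first moves the old data to the image and then overwrites it in place.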
With one approach to maintaining many images with copy-on-write, the more images that need to be supported, the greater the overhead, in terms of the writes required for image maintenance, caused by production computer write activity to the source or production volume.
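The growth in overhead may be illustrated by counting physical writes. The sketch below assumes one particular approach, in which every image independently keeps its own copy of preserved data; the function name and the use of a set of copied tracks per image are hypothetical.

```python
def writes_for_source_update(images_copied, track):
    """Count physical writes caused by one production write to `track`.

    `images_copied` is a list with one set per image, holding the
    tracks that image has already preserved (assumed model).
    """
    writes = 0
    for copied in images_copied:
        if track not in copied:
            copied.add(track)   # old data is written to this image first
            writes += 1
    return writes + 1           # plus the production write itself


four_images = [set() for _ in range(4)]
first = writes_for_source_update(four_images, track=7)
second = writes_for_source_update(four_images, track=7)
```

Under this assumption, the first write to a track costs one physical write per image plus the production write, and later writes to the same track cost only the production write.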