In the field of computer storage systems, there is increasing demand for what have come to be described as “advanced functions”. Such functions go beyond the simple I/O functions of conventional storage controller systems. Advanced functions are well known in the art and depend on the control of metadata used to retain state data about the real or “user” data stored in the system. The manipulations available using advanced functions enable various actions to be applied quickly to virtual images of data, while leaving the real data available for use by user applications. One such well-known advanced function is FlashCopy.
At the highest level, FlashCopy is a function where a second image of ‘some data’ is made available. This function is sometimes known in other system contexts as Point-In-Time copy, or T0-copy. The second image's contents are initially identical to that of the first. The second image is made available ‘instantly’. In practical terms this means that the second image is made available in much less time than would be required to create a true, separate, physical copy, and that this means that it can be established without unacceptable disruption to a using application's operation.
Once established, the second copy can be used for a number of purposes including performing backups, system trials and data mining. The first copy continues to be used for its original purpose by the original using application. Contrast this with backup without FlashCopy, where the application must be shut down, and the backup taken, before the application can be restarted again. It is becoming increasingly difficult to find time windows where an application is sufficiently idle to be shut down. The cost of taking a backup is increasing. There is thus significant and increasing business value in the ability of FlashCopy to allow backups to be taken without stopping the business.
FlashCopy implementations achieve the illusion of the existence of a second image by redirecting read I/O addressed to the second image (henceforth Target) to the original image (henceforth Source), unless that region has been subject to a write. Where a region has been the subject of a write (to either Source or Target), then to maintain the illusion that both Source and Target own their own copy of the data, a process is invoked which suspends the operation of the write command, and without it having taken effect, issues a read of the affected region from the Source, applies the read data to the Target with a write, then (and only if all steps were successful) releases the suspended write. Subsequent writes to the same region do not need to be suspended since the Target will already have its own copy of the data. This copy-on-write technique is well known and is used in many environments.
All implementations of FlashCopy rely on a data structure which governs the decisions discussed above, namely, the decision as to whether reads received at the Target are issued to the Source or the Target, and the decision as to whether a write must be suspended to allow the copy-on-write to take place. The data structure essentially tracks the regions or grains of data that have been copied from source to target, as distinct from those that have not. In its simplest form, this data structure is maintained in the form of a bitmap showing which grains have been written to, and which are untouched by write activity.
Some storage controllers allow a user to configure more than one target for a given source. This has a number of applications. For instance, different experiments could be run against each of the targets. Or the targets might be taken at different times (e.g. different days in the week), and allow historical access to the disk, perhaps for the purpose of recovering from some data corruption, such as might be caused by a virus.
Existing implementations of multiple target FlashCopy extend the FlashCopy algorithm, by configuring the disks as shown in FIG. 1, where A is a source LOGICAL UNIT, and B and C show two targets that were taken at some time in the past. A, B and C can each be updated. The arrows show grains (fixed sized regions of the disk) which are still dependent on the source LOGICAL UNIT. These have corresponding bits of ‘0b’ in the bitmap which tracks the progress of each FlashCopy.
This conventional algorithm simply arranges the multiple targets in parallel, and operates the original FlashCopy algorithm over each mapping in turn.
However, there is a drawback with the conventional algorithm for managing a multiple FlashCopy arrangement in that writes to a grain in the source might cause copying of multiple grains (one in each relationship). The last grain in the above example would be one. This causes extra processing and latency, and limits the scalability of the prior art multiple FlashCopy arrangement, and hence its usability. Accordingly, there is a need for a solution that mitigates processing and latency associated with management of a multiple FlashCopy arrangement.