In the field of computer storage systems, there is increasing demand for what have come to be described as “advanced functions”. Such functions go beyond the simple I/O functions of conventional storage controller systems. Advanced functions are well known in the art and depend on the control of metadata used to retain state data about the real or “user” data stored in the system. The manipulations available using advanced functions enable various actions to be applied quickly to virtual images of data, while leaving the real data available for use by user applications. One such well-known advanced function is FlashCopy.
At the highest level, FlashCopy is a function where a second image of ‘some data’ is made available. This function is sometimes known in other system contexts as Point-In-Time copy, or T0-copy. The second image's contents are initially identical to those of the first. The second image is made available ‘instantly’. In practical terms, this means that the second image is made available in much less time than would be required to create a true, separate, physical copy, and that it can therefore be established without unacceptable disruption to a using application's operation. Once established, the second copy can be used for a number of purposes, including performing backups, system trials and data mining. The first copy continues to be used for its original purpose by the original using application.
FlashCopy implementations achieve the illusion of the existence of a second image by redirecting read I/O addressed to the second image (henceforth Target) to the original image (henceforth Source), unless that region has been subject to a write. Where a region has been the subject of a write (to either Source or Target), to maintain the illusion that both Source and Target own their own copy of the data, a process is invoked which suspends the operation of the write command and, without it having taken effect, issues a read of the affected region from the Source, applies the read data to the Target with a write, then (and only if all steps were successful) releases the suspended write. Subsequent writes to the same region do not need to be suspended, since the Target will already have its own copy of the data. This copy-on-write technique is well known and is used in many environments. All implementations of FlashCopy rely on a data structure which governs the decisions discussed above, namely, the decision as to whether reads received at the Target are issued to the Source or the Target, and the decision as to whether a write must be suspended to allow the copy-on-write to take place. The data structure essentially tracks the regions or grains of data that have been copied from Source to Target, as distinct from those that have not. In its simplest form, this data structure is maintained in the form of a bitmap showing which grains have been written to, and which are untouched by write activity.
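The copy-on-write behaviour described above can be illustrated with a minimal sketch. It is not taken from any actual FlashCopy implementation: the class name `FlashCopyMap`, its methods, and the use of Python lists as stand-ins for disks are all assumptions made for illustration, and each list element represents one whole grain.

```python
class FlashCopyMap:
    """Toy point-in-time copy from `source` to `target`, tracked per grain.

    Hypothetical illustration only: disks are Python lists and each
    element stands for one grain.
    """

    def __init__(self, source, target):
        assert len(source) == len(target)
        self.source = source
        self.target = target
        # Bitmap: True means the grain has been copied from Source to
        # Target, so the Target owns its own copy of that grain.
        self.copied = [False] * len(source)

    def _split(self, grain):
        # Copy-on-write step: before a pending write takes effect, read
        # the grain from the Source and write it to the Target.
        if not self.copied[grain]:
            self.target[grain] = self.source[grain]
            self.copied[grain] = True

    def read_source(self, grain):
        return self.source[grain]

    def read_target(self, grain):
        # Reads of uncopied grains are redirected to the Source.
        if self.copied[grain]:
            return self.target[grain]
        return self.source[grain]

    def write_source(self, grain, data):
        # The write is held back until the split has completed.
        self._split(grain)
        self.source[grain] = data

    def write_target(self, grain, data):
        # Following the description above, a Target write also splits the
        # grain first. This matters for writes smaller than a grain; the
        # sketch only models whole-grain writes.
        self._split(grain)
        self.target[grain] = data


source = ["a", "b", "c"]
target = [None, None, None]
fc = FlashCopyMap(source, target)

fc.write_source(0, "A")    # grain 0 is split, then the Source is updated
fc.write_target(1, "x")    # grain 1 is split, then the Target is updated
print(fc.read_target(0), fc.read_target(1), fc.read_target(2))
print(fc.copied)
```

In this sketch the Target keeps returning the point-in-time contents of grain 0 after the Source is overwritten, and grain 2 is still served from the Source because its bit has never been set.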
Some storage controllers allow a user to configure more than one target for a given source. This has a number of applications. For instance, different experiments could be run against each of the targets. Alternatively, the targets might be taken at different times (e.g. different days in the week) and allow historical access to the disk, perhaps for the purpose of recovering from some data corruption, such as might be caused by a virus. Existing implementations of multiple target FlashCopy extend the FlashCopy algorithm by configuring the disks as shown in FIG. 1, where A is a source logical unit, and B and C show two targets that were taken at some time in the past. A, B and C can each be updated. The arrows show grains (fixed sized regions of the disk) which are still directly dependent on the source logical unit. These have corresponding bits of ‘0b’ in the bitmap which tracks the progress of each FlashCopy. This conventional algorithm simply arranges the multiple targets in parallel, and operates the original FlashCopy algorithm over each mapping in turn. This has a drawback in that a write to a grain in the source might split multiple grains (one in each relationship). The last grain in the example of FIG. 1 is one such grain. This causes extra processing and latency, and may limit the scalability of the multiple FlashCopy arrangement, and hence its usability.
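The cost of the parallel arrangement can be seen in a small sketch that counts the physical writes caused by one host write to the source. The function name `write_source_parallel`, the list-based disks, and the bitmap values below are assumptions chosen for illustration; they are not the contents of FIG. 1.

```python
def write_source_parallel(source, mappings, grain, data):
    """Apply one host write to `source` with several parallel targets.

    `mappings` is a list of (target, copied_bitmap) pairs, one per
    FlashCopy relationship. Returns the number of physical writes
    performed. Hypothetical sketch, not an actual controller algorithm.
    """
    writes = 0
    for target, copied in mappings:
        # The original copy-on-write check is run over each mapping in turn.
        if not copied[grain]:
            target[grain] = source[grain]   # split this grain in this mapping
            copied[grain] = True
            writes += 1
    source[grain] = data                    # finally apply the host write
    return writes + 1


# Source A with two targets B and C; the bitmaps are assumed example values.
a = ["a0", "a1", "a2", "a3"]
b, b_copied = ["a0", None, "a2", None], [True, False, True, False]
c, c_copied = [None, "a1", "a2", None], [False, True, True, False]
mappings = [(b, b_copied), (c, c_copied)]

# Grain 3 is uncopied in both mappings, so one host write splits two grains.
print(write_source_parallel(a, mappings, 3, "new"))
```

With T targets that all still depend on the source for a grain, a single host write to that grain costs T + 1 physical writes in this model, which is the scalability concern raised above.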
It is possible to provide an arrangement which structures the FlashCopy mappings in a cascade and then functions to ensure that the number of writes needed for any host I/O is bounded at 2, regardless of the number of target disks in the cascade. Such an arrangement, however, does not allow for the situation in which a set of cascade relationships is broken by a disk failure (or other disk offline condition) somewhere in the cascade. Thus, one respect in which cascaded FlashCopy targets are inferior to the conventional scheme is that the data on a cascaded target is dependent on all the disks above it—not just the source as in the conventional scheme. If the source disk becomes inaccessible (the disk fails or otherwise goes offline) it is reasonable to expect the target disks to become inaccessible. However, if a target disk becomes inaccessible (the disk fails or otherwise goes offline) it is not reasonable for all other disks in the cascade to become inaccessible because this is contrary to the user's view of the copies being made.
A further refinement to the above scheme provides a storage controller having control components (which may be implemented in hardware, software or a combination of these) capable of applying rule-based logic to provide a system in which a cascade may be divided into a plurality of sequences, and in which original data from the source is preferentially cascaded to attempt to ensure preservation of a copy of the original data in at least one member of each sequence.
Using this refinement of the technique, if data is removed from the source disk of the cascade, in the majority of cases the data will be maintained on a number of downstream disks. The number of copies (N) of the data can be chosen from within the range N=2 to (number of disks in the cascade −1). In this way it may be guaranteed that if up to N−1 disks in the cascade (excluding the source of the cascade, which is treated differently) become inaccessible, the data for all the disks in the cascade can be extracted from the other disks. When a disk becomes inaccessible, a recovery procedure is activated to ensure that multiple copies of the data are held on the remaining disks within the cascade. Thus, provided that too many failures do not occur in quick succession, the cascade will still be able to cope with a disk failure.
A problem then arises in that if a disk becomes inaccessible, even for a short length of time, it must be removed from the cascade. The disk may be complete at the point it became inaccessible. However, the rest of the chain may have moved on because writes to other disks in the chain may have caused the original data for any given grain to move down the chain, and the data held on the offline disk is now redundant from the point of view of the remainder of the cascade.