This invention relates to a method of, and system for, performing a data write on a storage device. The invention, in one embodiment, provides a mechanism to allow storage subsystems to take part in transactional rollbacks.
The storage of data in large organisations is of fundamental importance, both for reliability of the data and for the ability to recover data in the event of any hardware failure. A storage area network (SAN) is an architecture that is used when very large amounts of data need to be stored in a reliable and secure manner. This technology allows networks to be created that support the attachment of remote computer storage devices such as disk arrays to servers in such a way that, to the operating system, the devices appear as locally attached. It is common in these networks to include a large amount of redundancy, both in the data storage and in the hardware connections between the individual components.
Various methods exist for creating data redundancy. For example, a function such as the flashcopy function enables an administrator to make point-in-time, full volume copies of data, with the copies immediately available for read or write access. The flashcopy function can be used with standard backup tools that are available in the user's environment to create backup copies on tape. Flashcopy creates a copy of a source volume on a target volume. This copy is called a point-in-time copy.
When a flashcopy operation is initiated, a relationship is created between a source volume and target volume. This relationship is a “mapping” of the source volume and the target volume. This mapping allows a point-in-time copy of that source volume to be copied to the associated target volume. The relationship exists between this volume pair, from the time that the flashcopy operation is initiated, until the storage unit copies all data from the source volume to the target volume or the relationship is deleted.
When the data is physically copied, a background process copies tracks from the source volume to the target volume. The amount of time that it takes to complete the background copy depends on the following criteria: the amount of data being copied, the number of background copy processes that are occurring, and any other activities that are presently occurring.
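The mapping and background copy described above can be illustrated with a minimal in-memory sketch. It is not the implementation of any particular storage product: the class name FlashCopyMapping, its methods, and the use of Python lists as volumes are hypothetical, and the copy-on-write step that preserves a track before the source is overwritten is an assumed technique for keeping the point-in-time image intact while the background copy is still running.

```python
class FlashCopyMapping:
    """Hypothetical in-memory model of a source/target point-in-time mapping."""

    def __init__(self, source, target):
        if len(source) != len(target):
            raise ValueError("source and target must have the same number of tracks")
        self.source = source
        self.target = target
        # copied[i] is True once track i on the target holds the point-in-time data.
        self.copied = [False] * len(source)

    def _copy_track(self, i):
        if not self.copied[i]:
            self.target[i] = self.source[i]
            self.copied[i] = True

    def write_source(self, i, data):
        # Assumption: copy-on-write. Preserve the old track before overwriting it.
        self._copy_track(i)
        self.source[i] = data

    def read_target(self, i):
        # A track that has not been copied yet is still unchanged on the source.
        return self.target[i] if self.copied[i] else self.source[i]

    def background_copy_step(self):
        """Copy one outstanding track; return True once every track is copied."""
        for i, done in enumerate(self.copied):
            if not done:
                self._copy_track(i)
                break
        return all(self.copied)


# Usage: the point-in-time image is readable immediately, before any
# background copying has taken place.
source = ["a0", "b0", "c0"]
target = [None, None, None]
mapping = FlashCopyMapping(source, target)

mapping.write_source(1, "b1")  # a host write arriving after the copy was started
snapshot = [mapping.read_target(i) for i in range(len(source))]

# Run the background copy to completion; the relationship could then end.
while not mapping.background_copy_step():
    pass
```

In this sketch the duration of the background copy is simply proportional to the number of outstanding tracks; the other criteria listed above, such as competing copy processes and other activity, are not modelled.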
In storage, the user can create a flashcopy that takes a point-in-time backup of some storage disks. If the user subsequently has a problem with their storage, they can reverse the flashcopy to restore the saved version of the data. The direction of the flashcopy relationship can be reversed, where the volume that was previously defined as the target becomes the source for the volume that was previously defined as the source (and is now the target). The data that has changed is copied to the volume previously defined as the source.
An administrator can reverse a flashcopy relationship if they wish to restore a source volume (volume A) to a point in time before they originally performed the flashcopy operation. In effect, they are reversing the flashcopy operation so that it appears as though no flashcopy operation ever happened. The background copy process of a flashcopy operation must complete before it is possible to reverse the relationship so that volume B becomes the source and volume A becomes the target.
There will be certain circumstances in which it is desired to reverse an original flashcopy relationship. For example, a flashcopy relationship may be created between source volume A and target volume B, after which data loss occurs on source volume A. It is then possible to reverse the flashcopy relationship so that volume B is copied to volume A.
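The reversal of volume A and volume B can be sketched in the same spirit. The function name reverse_flashcopy and the background_copy_complete flag are hypothetical, and detecting changed tracks by comparing their contents is an assumption made to keep the example short; a real storage unit would be expected to track changes in some other way.

```python
def reverse_flashcopy(volume_a, volume_b, background_copy_complete):
    """Restore volume_a from volume_b and return the indices of the tracks copied back.

    volume_a is the original source, volume_b the original target. After the
    reversal, volume_b acts as the source and volume_a as the target.
    """
    if not background_copy_complete:
        # The original background copy must finish before the roles can be reversed.
        raise RuntimeError("background copy must complete before reversing")
    if len(volume_a) != len(volume_b):
        raise ValueError("volumes must have the same number of tracks")

    restored = []
    for i in range(len(volume_a)):
        # Assumption: only tracks whose contents differ need to be copied back.
        if volume_a[i] != volume_b[i]:
            volume_a[i] = volume_b[i]
            restored.append(i)
    return restored


# Usage: track 1 of volume A was damaged after the flashcopy was taken.
volume_b = ["a0", "b0", "c0"]         # point-in-time copy
volume_a = ["a0", "corrupted", "c0"]  # original source after data loss
restored_tracks = reverse_flashcopy(volume_a, volume_b, background_copy_complete=True)
```

Only the changed track is copied back, which mirrors the statement that the data that has changed is copied to the volume previously defined as the source.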
Unfortunately, there are a number of disadvantages with this method of operating the data storage. For example, using the flashcopy function, it is straightforward to restore back to the point in time at which the flashcopy was taken, but this is not always the right time in the context of the data recovery that is being attempted. Similarly, when the copies are taken as a function of clock time, as a background task, rather than based on what is going on in the system, the actual point in time may not be of any use for the data recovery. Even continuous data protection is not automated and has no notion of when it is sensible to take backups. Additionally, copies tend to be large and to include many sets of disks, because many interleaved systems are in play and all of them need to be cross-consistent. This creates a large processing and storage burden. Systems have to be quiesced and flushed in a monolithic manner. In most flashcopy scenarios the applications are stopped and the device drivers are flushed of cached data prior to the flashcopy. This flushes all application data to the storage device in order to create a clean image of the vdisk that is being flashcopied, and the flushed data may include data that was cached for other vdisks and applications that are not involved in the flashcopy.
Additionally, when a user creates a point-in-time backup copy of a virtual disk using flashcopy, they are trying to fulfil the business requirement of taking a backup copy of a whole set of business data (that is stored on the disk) as it is at a particular point in time. There is a problem in doing this, as there are many layers of caching, possibly including, for example, the application and web-server middleware, the database, the file system and the multipathing device driver. The current approach to taking a flashcopy of a set of business data is to stop all the application and middleware work that is using the storage disk and (typically) to shut down the associated middleware servers, forcing them to write all of their data to the disk. This disk can then be flashcopied to create a consistent set of business data that can be backed up.
This is a problem if a user wishes to take a point-in-time copy of the business data, but does not wish to stop or shut down the applications or application server. Because the user wants a consistent set of data, the flashcopy should not be taken part way through a unit of work. However, these units of work occur very rapidly, all starting and stopping at “machine speed”, so it would be impossible to press a button, or indeed to start a flashcopy, at exactly the point in time when an application has reached a consistent business state.
It is therefore an object of the invention to improve upon the known art.