The value of data stored in computer systems is already very high, and that value is constantly increasing. The danger posed to governments, businesses and other organisations of losing data because of accidental or malicious events is regularly emphasised by reports of destructive “cracker” attacks, earthquakes, fires and other events which have left organisations unable to access data and continue operations. In some cases, businesses have failed because such an event has left them without adequate recovery capabilities. There are other circumstances, too, in which maximum continuity is necessary. Migration of a data processing facility to new premises can also cause disruption to normal business activities, and this poses a significant additional cost.
There is thus an increasing need for business continuity facilities, including remote backup data storage and processing capability. Establishing such business continuity systems, copying the data, and keeping the remotely stored backup data in step with the data at the main site or system is a significant business and technical effort and can involve considerable investment.
The main technology that is increasingly used for such business continuity solutions is known as Remote Copy. In this technology, a set of source logical disks is used at the primary site (hereinafter simply referred to as the “primary”), while a set of target logical disks at the secondary site (hereinafter referred to as the “secondary”) is kept in synchronization with the source logical disks. The target logical disks are typically at a site that is geographically remote from the main site at which the source logical disks are kept. If an adverse event of any kind destroys the disks or the data at the primary, or in any way makes the data unavailable or unusable, the business can be continued using the data at the secondary. In the same manner, if a migration is planned, it can be staged using Remote Copy so as to maximise the continuity of business operations.
There are essentially two types of Remote Copy: synchronous and asynchronous. In synchronous Remote Copy, the target logical disks are kept in lockstep with the source logical disks. That is, in synchronous Remote Copy, the requesting application does not receive completion of a write until both the source and the target logical disks have been updated. In asynchronous Remote Copy, writes to the target logical disks may lag behind writes to the source logical disks for some period of time. This difference is not material to the present description, and therefore, for the sake of simplicity, the description will only discuss synchronous Remote Copy.
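By way of illustration only, the distinction between the two types may be sketched as follows. This is a minimal model, not an implementation of any particular Remote Copy product: the names `LogicalDisk`, `RemoteCopyPair`, `pending` and `drain` are hypothetical and are introduced here purely for the example, and the network between the sites is not modelled.

```python
from collections import deque


class LogicalDisk:
    """Toy logical disk: a mapping from block number to block contents."""

    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data


class RemoteCopyPair:
    """Hypothetical source/target pair in synchronous or asynchronous mode."""

    def __init__(self, synchronous):
        self.source = LogicalDisk()
        self.target = LogicalDisk()
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet applied at the secondary

    def write(self, block, data):
        """Returning from this method stands for signalling completion."""
        self.source.write(block, data)
        if self.synchronous:
            # Completion is withheld until the target has also been updated.
            self.target.write(block, data)
        else:
            # Completion is signalled at once; the target catches up later.
            self.pending.append((block, data))

    def drain(self):
        """Apply queued writes to the target in their original order."""
        while self.pending:
            block, data = self.pending.popleft()
            self.target.write(block, data)


sync_pair = RemoteCopyPair(synchronous=True)
sync_pair.write(0, b"A")
sync_in_step = sync_pair.target.blocks == sync_pair.source.blocks

async_pair = RemoteCopyPair(synchronous=False)
async_pair.write(0, b"A")
async_lagging = async_pair.target.blocks != async_pair.source.blocks
async_pair.drain()
async_caught_up = async_pair.target.blocks == async_pair.source.blocks
```

In the synchronous pair the target matches the source as soon as the write returns, whereas in the asynchronous pair the target lags until the queued writes have been drained.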
One conventional technique, representing current best practice for establishing a remote copy operation, is described below with reference to FIG. 1. It involves taking a backup copy (102) of the primary database using, for example, a conventional backup method such as a tape dump utility. The backup is then either transmitted via a network to the secondary or physically taken there on tape, and is loaded (104). A remote copy of a volume containing the database redo logs is created (106); this volume contains records of all the changes made to the primary since the taking of the backup copy (102). The redo log is then synchronised (108) to the secondary. The database is quiesced (110) at the primary. The redo logs are then reapplied (112) at the secondary. The remote copy of the data volumes is activated (114). Finally, the primary is reactivated (116).
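The sequence of steps 102 to 116 may be sketched, for illustration only, as the following toy model. The class `Primary`, the function `establish_remote_copy` and the parameter `writes_during_setup` are hypothetical names introduced for this example; a dictionary stands in for the database, and a list of key/value pairs stands in for the redo log.

```python
class Primary:
    """Toy primary database that records every change in a redo log."""

    def __init__(self, data):
        self.data = dict(data)
        self.redo_log = []
        self.online = True

    def write(self, key, value):
        if not self.online:
            raise RuntimeError("primary is quiesced; applications are offline")
        self.data[key] = value
        self.redo_log.append((key, value))


def establish_remote_copy(primary, writes_during_setup):
    """Walk through steps 102 to 116 and return the secondary's data."""
    backup = dict(primary.data)       # 102: take a backup copy
    primary.redo_log.clear()          # assumption: log holds post-backup changes only
    secondary = dict(backup)          # 104: transmit or carry the backup, then load it

    # Applications continue to write at the primary while the backup is moved.
    for key, value in writes_during_setup:
        primary.write(key, value)

    # 106, 108: remote copy of the redo log volume, synchronised to the secondary.
    remote_log = list(primary.redo_log)

    primary.online = False            # 110: quiesce the database at the primary
    for key, value in remote_log:     # 112: reapply the redo logs at the secondary
        secondary[key] = value
    # 114: the remote copy of the data volumes would be activated here.
    primary.online = True             # 116: reactivate the primary
    return secondary


primary = Primary({"a": 1})
secondary = establish_remote_copy(primary, [("a", 2), ("b", 3)])
in_step = secondary == primary.data
```

Between the line marked 110 and the line marked 116 any call to `write` would raise an error, which corresponds to the period during which the applications are offline.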
It will be clear to one skilled in the art that this process is burdensome and potentially very costly to the business as the customer's applications are taken offline during the performance of steps 110 to 116. In addition, because it relies on undo-redo logging, and because ordinary flat-file systems of the art do not usually benefit from this type of logging, this technique cannot be used for all types of data storage.
An alternative technique would be to establish a remote copy relationship between a primary and a secondary site in which the storage at the secondary site is initially empty, and then to use conventional techniques to initiate synchronization at the remote secondary during continuing operation at the primary. This technique has several disadvantages. The first is that the process lasts an indefinite amount of time, during which the system is vulnerable to data loss if any failures occur at the primary. The second is that the connection between the primary and secondary sites is, in most cases, an expensive private leased line, such as a T1 line. Transmission of large amounts of data over such a line without any rapidly-established level of protection for the data is not generally acceptable as a reasonable business expense.
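The scale of the first disadvantage may be appreciated from a rough calculation, given here for illustration only. The 1 TB volume size is an assumed figure chosen for the example and does not come from any measured system; the nominal T1 line rate of 1.544 Mbit/s is used, and protocol overhead and any writes arriving during the initial synchronization are ignored, so the real duration would be longer.

```python
T1_BITS_PER_SECOND = 1.544e6      # nominal T1 line rate
VOLUME_BYTES = 1e12               # assumed example: 1 TB of data to synchronise

# Time to push the whole volume over the line at the full nominal rate.
transfer_seconds = VOLUME_BYTES * 8 / T1_BITS_PER_SECOND
transfer_days = transfer_seconds / 86400

print(f"Initial synchronization would take about {transfer_days:.0f} days")
```

Under these assumptions the initial synchronization occupies the line for roughly two months, throughout which the data at the secondary is incomplete.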
It would therefore be desirable to increase the speed with which a remote copy can be established, while avoiding the need to quiesce applications at the primary and allowing all types of data to be included in the scope of the remote copy. It would be further desirable to minimize the use of network resources, and the associated costs, in establishing the remote copy.