The present invention relates generally to asynchronous trackable file replication used for disaster recovery in computer systems. More particularly, the present invention relates to asynchronously tracking the replication of active log files used in mirroring to a standby data processing site.
In recent years, the use of networked computer software and the Internet has brought about a significant increase in the amount of network traffic and the number of transactions performed by software applications residing on networked servers. More information is stored by these networked applications and in remote database applications than ever before. These applications process a large number of purchase transactions, credit card transactions, electronic mailing lists, email functions, data distribution, batch processing, etc. Such systems contain critical data, which must be constantly backed up so the information is not lost. Further, application end users expect networked applications and data to be available 24 hours a day, 7 days a week.
To provide robust services that are constantly available, computer systems must have redundant backup systems. It is inevitable that the primary system will fail on occasion. When the primary system fails, a backup system must be quickly available. A backup system can be located on-site with the primary system, and a secondary backup system can be located at a physically remote backup site. Having at least one backup system on-site is valuable because the networked applications can immediately fail over to that backup system if the primary system fails or crashes. A second backup system at a remote site, sometimes called the standby site, is desirable because it protects against catastrophic failure at the primary site. If the primary site is disabled by an extended power outage, fire, or another disaster, then the remote standby system will be activated. A failover to an off-site standby system is relatively slow, but it provides the maximum amount of protection against total system failure. This type of fail-safe system is especially valuable for applications that are connected to the Internet and need to be constantly available. In the event of a failure, the standby system is always ready to take over. Usually, the standby system is located in another building or in a geographically remote area.
For certain transactional systems, such as a database, an active transaction log is kept which tracks recent transactions. An archive log is then kept to store information from the active log after the active log has been filled or a certain time period has passed. To mirror a transactional system between the primary site and the standby site, both the active logs and the archive logs must be transferred to the standby system. The archived logs are then applied to the standby system, which keeps the standby system current. The active logs at the standby system provide a record of the transactions not yet archived, and are utilized on failover to the standby site.
Mirroring occurs continuously, and transaction logs must be replicated continuously to keep a database and its backup system synchronized. Typically, the replication or mirroring must be asynchronous because the data is often sent over wide area networks whose response time can vary significantly.
The invention provides a method for asynchronously tracking the replication of data writes from an application that is subject to system and network failure, to a standby data processing unit located at a standby site. The method includes the step of enabling access to sequence numbers created for use in replication of the data writes. Another step is sending the data writes from the application to a remote mirroring module. A sequence number is assigned to each write accepted. The next step is tracking a most recent local write sequence number for a local data write and a most recent replication sequence number for replicated data writes. It can then be determined when a specific data write at the local site has been successfully replicated at the standby site by correlating the most recent local write sequence number and the most recent replication sequence number. An additional step is initiating replication of the data writes for which replication has not taken place.
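The sequence-number correlation described above can be illustrated with a minimal sketch. The class name `ReplicationTracker`, its method names, and the use of cumulative, in-order acknowledgements from the standby site are assumptions made for illustration only; the source does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReplicationTracker:
    """Hypothetical tracker; all names here are illustrative assumptions."""
    last_local_seq: int = 0        # most recent local write sequence number
    last_replicated_seq: int = 0   # most recent replication sequence number
    pending: Dict[int, bytes] = field(default_factory=dict)

    def accept_write(self, data: bytes) -> int:
        """Assign the next sequence number to an accepted write."""
        self.last_local_seq += 1
        self.pending[self.last_local_seq] = data
        return self.last_local_seq

    def acknowledge(self, seq: int) -> None:
        """Record that the standby has replicated every write up to seq.

        Assumes acknowledgements are cumulative and arrive in order.
        """
        if seq > self.last_local_seq:
            raise ValueError("acknowledged a write that was never accepted")
        if seq <= self.last_replicated_seq:
            return  # stale or duplicate acknowledgement
        for s in range(self.last_replicated_seq + 1, seq + 1):
            self.pending.pop(s, None)
        self.last_replicated_seq = seq

    def is_replicated(self, seq: int) -> bool:
        """Correlate a write's sequence number with the replication number."""
        return seq <= self.last_replicated_seq

    def unreplicated(self) -> List[int]:
        """Sequence numbers of writes for which replication must be initiated."""
        return sorted(self.pending)


tracker = ReplicationTracker()
first = tracker.accept_write(b"log record 1")
second = tracker.accept_write(b"log record 2")
third = tracker.accept_write(b"log record 3")
tracker.acknowledge(second)
```

In this sketch, a write is considered replicated once its sequence number is no greater than the most recent replication sequence number, so after the acknowledgement only the third write remains to be replicated.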
In accordance with another embodiment of the present invention, the system includes a method for handling overflow of disk spool writes in a remote mirroring unit subsequent to a network or system failure. This method includes the step of sending data writes that need to be replicated from the application through the remote mirroring unit after the network or system failure has been repaired. A following step is recording the data writes that are not spooled in the disk spool due to the spool overflow. The unspooled data writes are recorded in a spool overflow list. The spool overflow list is used to enable subsequent resynchronization of a standby storage system with the primary storage system for the application.
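One way to picture the spool overflow list is the following sketch. The class `DiskSpool`, the fixed entry capacity, and the choice to record only block numbers (not data) for unspooled writes are assumptions for illustration; the source states only that unspooled writes are recorded in a spool overflow list that enables later resynchronization.

```python
from collections import deque
from typing import Deque, List, Set, Tuple


class DiskSpool:
    """Hypothetical bounded spool with an overflow list (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity                      # assumed limit, in entries
        self.entries: Deque[Tuple[int, bytes]] = deque()
        self.overflow_blocks: Set[int] = set()        # the spool overflow list

    def write(self, block_no: int, data: bytes) -> bool:
        """Spool the write if room remains; otherwise note the block number."""
        if len(self.entries) < self.capacity:
            self.entries.append((block_no, data))
            return True
        # Spool is full: remember which block changed so that it can be
        # copied to the standby storage system during resynchronization.
        self.overflow_blocks.add(block_no)
        return False

    def blocks_to_resync(self) -> List[int]:
        """Blocks that must be re-sent to bring the standby up to date."""
        return sorted(self.overflow_blocks)


spool = DiskSpool(capacity=2)
results = [
    spool.write(10, b"a"),
    spool.write(11, b"b"),
    spool.write(12, b"c"),   # spool full, recorded in the overflow list
    spool.write(12, b"d"),   # same block again, recorded only once
    spool.write(7, b"e"),
]
```

Recording block numbers in a set keeps the overflow list small even when the same block is rewritten many times during an outage, at the cost of re-reading the current block contents from primary storage at resynchronization time.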
Another detailed aspect of the present invention is a device for handling disk spool overflow in a remote mirroring system. The device comprises a data transaction application that includes a plurality of data blocks that are intended to be replicated in a mirrored storage subsystem. A primary remote mirroring module having a local disk spool is included and configured to send data blocks received from the data transaction application to a standby data processing system for replication. A standby remote mirroring module operates in the standby data processing system, and includes a standby disk spool to receive the data blocks sent by the primary remote mirroring module. The device also includes a spool overflow in the primary remote mirroring module, and a memory cache records the writes of data blocks when the local disk spool has overflowed.
Additional features and advantages of the invention will be apparent from the detailed description which follows, taken in conjunction with the accompanying drawings, which together illustrate, by way of example, features of the invention.