1. Technical Field
The present invention relates to data storage and retrieval generally and more particularly to a method and system of providing periodic replication.
2. Description of the Related Art
Information drives business. Companies today rely to an unprecedented extent on online, frequently accessed, constantly changing data to run their businesses. Unplanned events that inhibit the availability of this data can seriously damage business operations. Additionally, any permanent data loss, from natural disaster or any other source, will likely have serious negative consequences for the continued viability of a business. Therefore, when disaster strikes, companies must be prepared to eliminate or minimize data loss, and recover quickly with useable data.
Replication is one technique used to minimize data loss and improve the availability of data: a replicated copy of data is distributed and stored at one or more remote sites or nodes. In the event of a site migration, a failure of one or more physical disks storing data, or a failure of a node or host data processing system associated with such a disk, the remote replicated data copy may be utilized, ensuring data integrity and availability. Replication is frequently coupled with other high-availability techniques such as clustering to provide an extremely robust data storage solution.
Replication may be performed by hardware or software and at various levels within an enterprise (e.g., database transaction, file system, or block-level access) to reproduce data from a replication source volume or disk within a primary node (a primary volume) to a remote replication target volume or disk within a secondary node (a secondary volume). Replication may be synchronous, in which write operations are transmitted to and acknowledged by one or more secondary node(s) before completing at the application level of a primary node, or asynchronous, in which write operations are performed at a primary node and persistently queued for forwarding to each secondary node as network bandwidth allows. The asynchronous mode of replication is the most complex form of replication due to the requirements of replication log management at the primary node and write ordering at the secondary node. Asynchronous replication requires writes to be ordered at the secondary node to ensure that the replicated volume is consistent. It also requires the writes to be ordered across a set of volumes if an application (e.g., a database) uses more than one volume at a time. Synchronous replication, while not requiring writes to be ordered, suffers from I/O latency, sometimes significant, that depends on the characteristics of the network.
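The write-ordering requirement described above for asynchronous replication can be illustrated with a minimal sketch. It assumes that the primary node tags each write with a consecutive sequence number starting at zero; the class name `OrderedSecondary`, the method `receive`, and the sequence-number scheme are hypothetical and are not taken from any particular replication product.

```python
class OrderedSecondary:
    """Hypothetical secondary volume that applies writes in primary order."""

    def __init__(self):
        self.volume = {}    # block number -> data currently on the secondary volume
        self.pending = {}   # sequence number -> (block, data) awaiting predecessors
        self.next_seq = 0   # next sequence number that may be applied

    def receive(self, seq, block, data):
        """Accept one replicated write; return how many writes were applied."""
        self.pending[seq] = (block, data)
        applied = 0
        # Apply only the contiguous run of writes starting at next_seq, so the
        # volume never reflects a later write without all earlier ones.
        while self.next_seq in self.pending:
            blk, payload = self.pending.pop(self.next_seq)
            self.volume[blk] = payload
            self.next_seq += 1
            applied += 1
        return applied
```

In this sketch, a write that arrives ahead of its predecessors is held in `pending` and leaves the volume unchanged; once the missing write arrives, both are applied in sequence order, so the later write to a given block prevails.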
FIG. 1 illustrates a replication system block diagram according to the prior art. Primary node 100a of the illustrated prior art embodiment includes an application 102a (e.g., a database, mail server, web server, etc.), a file system 104a, a volume manager 106a including a volume replicator 108a or other replication facility, a primary data volume 110a or “replication source volume”, and a replication log 112a as shown. Volume replicator 108a of primary node 100a receives data (e.g., in conjunction with a write operation from application 102a, file system 104a, and/or volume manager 106a) to be stored within primary data volume 110a. Volume replicator 108a of primary node 100a then stores the received data within primary data volume 110a and transfers a replicated copy of the data at a block level to a corresponding volume replicator 108b within secondary node 100b over a network 114 (e.g., an IP network, LAN, WAN, or other communication link) coupled between primary node 100a and secondary node 100b. 
When replicating synchronously, volume replicators 108 are used to maintain primary and secondary site data synchronization. A write request from application 102a to a synchronously replicated volume such as primary data volume 110a is considered complete as soon as the update is logged at the primary node 100a and transmitted to and acknowledged by all secondary sites (e.g., secondary node 100b). Each secondary site confirms an update or write operation in two stages. A first confirmation acknowledges receipt of the update. A second confirmation, indicating that the primary node need no longer keep the update in its replication log 112a, is sent when data is on disk at the secondary site. Data to be written to primary data volume 110a is synchronously replicated by first writing it to replication log 112a. Thereafter the data may be concurrently written to disk storage associated with primary data volume 110a and transferred to secondary node 100b. Once the data has been received, secondary node 100b confirms its receipt to primary node 100a, so that completion of the write operation may be signaled to the write-initiating application 102a, and stores the data on disk storage associated with the secondary data volume 110b.
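The two-stage confirmation described above may be sketched as follows. This is a simplified, single-threaded illustration that assumes a single secondary node and in-memory dictionaries in place of disk storage and the network; the names `SyncPrimary`, `SyncSecondary`, `receive`, `flush`, and `release` are hypothetical.

```python
class SyncSecondary:
    """Hypothetical secondary node with two-stage confirmation."""

    def __init__(self):
        self.received = []  # updates received but not yet on disk
        self.volume = {}    # stands in for the secondary data volume

    def receive(self, seq, block, data):
        """Stage one: acknowledge receipt of the update."""
        self.received.append((seq, block, data))
        return seq

    def flush(self):
        """Stage two: put data on disk, return the sequence numbers now safe."""
        confirmed = []
        for seq, block, data in self.received:
            self.volume[block] = data
            confirmed.append(seq)
        self.received.clear()
        return confirmed


class SyncPrimary:
    """Hypothetical primary node performing synchronous replication."""

    def __init__(self, secondary):
        self.secondary = secondary
        self.log = {}       # stands in for the replication log: seq -> (block, data)
        self.volume = {}    # stands in for the primary data volume
        self.next_seq = 0

    def write(self, block, data):
        seq = self.next_seq
        self.next_seq += 1
        self.log[seq] = (block, data)   # log the update first
        self.volume[block] = data       # write to the primary volume
        # The write completes only after the secondary acknowledges receipt.
        if self.secondary.receive(seq, block, data) != seq:
            raise IOError("secondary did not acknowledge receipt")
        return seq

    def release(self, confirmed_seqs):
        """Drop log entries once the secondary reports the data is on disk."""
        for seq in confirmed_seqs:
            self.log.pop(seq, None)
```

In this sketch, `write` returns after the first confirmation, while the log entry remains until `release` is called with the result of the secondary's `flush`, corresponding to the second confirmation.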
When replication is asynchronous, an application write completes as soon as volume replicator 108a has logged the update in replication log 112a. Transmission and writing to secondary data volume 110b is concurrent with continued execution of application 102a. Following transfer of data associated with a requested write operation to replication log 112a, completion of the write operation may be signaled to the write-initiating application 102a. Thereafter (or concurrently with the signaling of the write operation's completion), the data may be transferred to secondary node 100b. The data is then typically written to disk storage associated with primary data volume 110a, followed by the storage of the data within replication log 112b, receipt confirmation by secondary node 100b to primary node 100a, the storage of the data on disk storage associated with the secondary data volume 110b, and an indication confirming the occurrence of the write to primary node 100a.
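The asynchronous write path described above may be sketched in a similar way. The sketch assumes an in-memory first-in, first-out queue in place of the persistent replication log and a plain dictionary in place of the secondary data volume; the names `AsyncPrimary`, `write`, `drain`, and `budget` are hypothetical, and the secondary-side log and confirmations are omitted for brevity.

```python
from collections import deque


class AsyncPrimary:
    """Hypothetical primary node performing asynchronous replication."""

    def __init__(self):
        self.log = deque()  # stands in for the replication log (FIFO order)
        self.volume = {}    # stands in for the primary data volume

    def write(self, block, data):
        """Log the update and return; the application does not wait for the secondary."""
        self.log.append((block, data))
        # In a real system the volume write may occur after completion is
        # signaled; it is done inline here to keep the sketch single-threaded.
        self.volume[block] = data
        return len(self.log)  # number of updates still queued

    def drain(self, secondary_volume, budget):
        """Forward up to `budget` queued updates, oldest first; return the count sent."""
        sent = 0
        while self.log and sent < budget:
            block, data = self.log.popleft()
            secondary_volume[block] = data
            sent += 1
        return sent
```

Here `budget` loosely models forwarding "as network bandwidth allows": the secondary may lag the primary by the contents of the queue, and draining in first-in, first-out order preserves the order of writes.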
A given node can serve as a primary node/replication source volume for one application and as a secondary node/replication target volume for another application. Furthermore, for the same application program, a given node can serve as a secondary node at one point in time, and as a primary node at another point in time to “cascade” replication of the data to other nodes connected via communications links. For example, a first replication may be made between nodes in different cities or states, and a node in one of the cities or states can in turn act as the primary node in replicating the data worldwide. Each replication primary node may also have more than one replication secondary node. As used herein, a reference to the secondary node implicitly refers to all secondary nodes associated with a given primary node unless otherwise indicated, as identical replication operations are typically performed on all secondary nodes.