In a distributed network environment, multiple copies of replicated data, such as multiple copies of files, objects, directory structures, programs or databases, are often distributed throughout the network. For example, in a wide area network (WAN) composed of multiple local area networks (LANs), a separate copy of replicated data may reside in at least one file server or workstation located on each of the LANs.
One benefit of having replicated data in the above-described distributed network is that it facilitates access to the replicated data by each of the nodes on the network. Nodes may simply obtain the desired data locally on their LAN rather than seeking the data from another node on the WAN in a perhaps more costly and time-consuming manner. In addition, replicated data helps to distribute the load on any given node, which would otherwise have to maintain the data and respond to all requests for such data from all other nodes on the network. A further benefit is enhanced system reliability, e.g., no one node (which may fail) exclusively possesses access to required data. Databases, network directory services and groupware are typical products that take advantage of replication.
Since replicated data may change and multiple copies of the replicated data are distributed throughout the network, replication facilities must typically employ some scheme for reconciling any differences and ensuring a certain amount of consistency among the replicas in the replica set. A replica set is considered to have strong consistency if the changes to the data are reconciled simultaneously throughout the set at some ordained time. Weak consistency is a concept which allows the replicas to be moderately, yet tolerably, inconsistent at various times.
As can be appreciated, maintaining strong consistency generally requires the use of more resources, e.g., at least in terms of network bandwidth consumed. Moreover, strong consistency becomes increasingly impractical, and at some point almost impossible to guarantee, as the number of replicas increases in a distributed system. This is mostly due to performance limitations, network partitioning, and the like. Consequently, most replicated systems are designed to implement some level of weak consistency.
Replication can conceptually be considered to occur pairwise and in one direction. One replication facility that performs replication through the use of a standardized interface is described in U.S. Pat. No. 5,588,147. The general replication topology described therein can be thought of as a graph of unidirectional edges where replication information is transmitted from a source to a destination. A replica node seeking replication (the destination) is responsible for originating a request for replication from a connected replica node (the source). This technique is known as pulling, since the destination attempts to pull the data from the source.
In the above facility, a cursor is maintained at the destination for each connection (edge) it has to a source from which it pulls data. Each cursor tracks the point of the last change information sent from a respective source to the destination. Using a cursor, when a destination requests replication from a given source, the source provides the destination with 1) a list of objects (or other data structures) that have changed and 2) the type of change which has occurred for each object since the last replication to that destination. To avoid unnecessary transmission, the source also filters from this list any change items which it knows were originated or propagated by the requesting destination. The source then updates the cursor maintained at the destination based upon the replication information provided during that replication cycle.
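The cursor-based pull described above can be illustrated with a minimal sketch. It is not the implementation of the referenced facility. The names `Change`, `Source`, `Destination`, `changes_since` and `pull` are hypothetical, the cursor is assumed to be a simple sequence number into the source's change log, and each change is assumed to carry a single `origin` field standing in for "originated or propagated by". Here the source returns the new cursor value and the destination stores it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    seq: int      # position in the source's change log (assumed cursor unit)
    obj_id: str   # object that changed
    kind: str     # type of change, e.g. "create", "modify", "delete"
    origin: str   # replica that originated or propagated the change (assumption)


@dataclass
class Source:
    name: str
    log: list = field(default_factory=list)

    def record(self, obj_id, kind, origin):
        """Append a change to this source's log with the next sequence number."""
        self.log.append(Change(len(self.log) + 1, obj_id, kind, origin))

    def changes_since(self, cursor, requester):
        """Return (changes after `cursor` not attributable to `requester`, new cursor)."""
        pending = [c for c in self.log
                   if c.seq > cursor and c.origin != requester]
        new_cursor = self.log[-1].seq if self.log else cursor
        return pending, new_cursor


@dataclass
class Destination:
    name: str
    cursors: dict = field(default_factory=dict)  # one cursor per source edge

    def pull(self, source):
        """Request replication from `source` and advance the cursor for that edge."""
        cursor = self.cursors.get(source.name, 0)
        changes, new_cursor = source.changes_since(cursor, self.name)
        self.cursors[source.name] = new_cursor
        return changes
```

Because the cursor is kept per edge, it records only what a given source has already sent to the destination. It carries no information about changes the destination received over other edges.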
However, while such a replication facility has many advantages, it also has its inefficiencies. For example, resources may be unnecessarily utilized by transmitting redundant replicated data to a destination which already has some or all of the transmitted data. Such a situation may occur where a destination receives replicated data from more than one source. For example, if a destination replica A first requests and receives an update of replicated data from a source replica B and then requests an update of replicated data from a second source replica C, there is a possibility that replica A may have already received (via replica B in the first update) some or all of the replicated data which will be sent by replica C. Of course, the chances of receiving redundant replication information from an indirect path increase as distributed system topologies grow in complexity.
Although the transmission of some redundant replication information may be tolerable or insignificant at certain times depending upon the system resources, at other times such redundancy may be highly significant. For example, a substantial amount of redundant information is often replicated when the replication topology is changed. When a new connection is established between existing but previously unconnected replica nodes, the nodes have not previously been exchanging replication information. To maintain the desired replication scheme and ensure data integrity, the entire replication information at each node is therefore exchanged. Thus, one of the nodes initially functions as a source while the other functions as a destination, whereby replication and reconciliation of all replicated objects take place in a known manner. The two nodes may then reverse roles, whereupon both are consistent. As long as the new connection is maintained, the nodes exchange replication information on a regular basis, thus maintaining the desired consistency.
While this method of fully synchronizing newly connected replicas accomplishes the desired goal of consistency, a substantial amount of redundant information is often exchanged between replicas, because some of the replication information has already been received indirectly from other sources.
The problem of excessive transmission of redundant replication information is compounded when a new replica is added to a system and connected to a number of existing replicas. By way of example, if a new replica C is added to a distributed system and connected to both replicas A and B, then the new replica C will fully synchronize with A and then fully synchronize with B (or vice-versa). However, if A has already reconciled to some extent with B, then upon the second replication with B, replica C may pull a substantial amount of replication information from B which is duplicative of what it just pulled during the first replication from A. If the changes to the topology are even more complex, a relatively large amount of network resources can be inefficiently consumed through the communication of redundant information.
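The A, B, C example above can be made concrete with a small sketch that counts how much of each full synchronization is redundant. It is an illustration only: each replica is modeled as a plain set of change identifiers, the identifiers `c1` through `c5` are invented for the example, and the function name `full_sync` is hypothetical.

```python
def full_sync(new_replica, source):
    """Pull every change the source holds; return (transferred, redundant) counts."""
    transferred = len(source)
    redundant = len(source & new_replica)  # changes already held via an earlier pull
    new_replica |= source                  # apply whatever was received
    return transferred, redundant


# Assumed starting state: A and B have already reconciled most changes with each other.
replica_a = {"c1", "c2", "c3", "c4"}
replica_b = {"c1", "c2", "c3", "c5"}
replica_c = set()  # newly added replica, connected to both A and B

print(full_sync(replica_c, replica_a))  # (4, 0)
print(full_sync(replica_c, replica_b))  # (4, 3)
```

In this toy case three of the four changes pulled from B duplicate what C had just pulled from A, so only one transfer in the second synchronization was actually needed.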