1. The Field of the Invention
The present invention generally relates to replicating data stores. More specifically, the invention relates to replicating data stores using knowledge of the changes that a particular data store is aware of to enumerate changes and detect conflicts.
2. Background and Relevant Art
In today's world of digital information handling, individuals may store information or data in a variety of different devices and locations. Often the user stores the same information in more than one device and/or location. Obviously, the user would like all of the various data stores to have the same information without having to manually input the same changes to each data store. Replication is one process used to ensure that each data store has the same information.
For example, a user may maintain an electronic address book in a myriad of different devices or locations. The user may maintain the address book, for example, on a personal information manager stored on their desktop computer, on their laptop computer, in a personal digital assistant (PDA), in an on-line contacts manager, and the like. The user can modify the electronic address books in each location by, for example, adding a contact, deleting a contact, or changing contact information. One goal of replication is to ensure that the change made on a particular device is ultimately reflected in the data stores of the user's other devices.
One common replication method involves keeping track of changes that have occurred since a previous replication. For example, a device seeking to be replicated with another device may submit a request for changes. Ideally, the changes sent are those that have occurred since the last replication. The replica sending updated information checks for any changes that are time stamped after the previous replication and sends those changes to the requesting device. Currently, replication typically requires that each replica be aware of the other replicas or of the topology in which it is operating. Each replica also maintains a record of which changes have been replicated to other replicas. In effect, each replica must maintain information about what it believes is stored on the other replicas within the topology.
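The time stamp approach described above can be illustrated with a minimal sketch. The names used here (`Change`, `Replica`, `changes_since`, `last_sync`) are hypothetical and chosen only for illustration; integer logical times are assumed in place of real clock values.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    item_id: str
    value: str
    timestamp: int  # assumed logical time at which the change was made


@dataclass
class Replica:
    name: str
    changes: list = field(default_factory=list)
    # Time of the last replication with each known peer. This is the
    # per-peer bookkeeping that each replica must maintain.
    last_sync: dict = field(default_factory=dict)

    def changes_since(self, peer_name: str) -> list:
        """Return changes time stamped after the last replication with the peer."""
        cutoff = self.last_sync.get(peer_name, 0)
        return [c for c in self.changes if c.timestamp > cutoff]


desktop = Replica("desktop")
desktop.changes.append(Change("contact-1", "555-0100", timestamp=5))
desktop.changes.append(Change("contact-2", "555-0199", timestamp=12))
desktop.last_sync["pda"] = 10  # the PDA last replicated at time 10

# Only the change made after time 10 is sent to the PDA.
pending = desktop.changes_since("pda")
```

Note that `last_sync` grows with the number of peers, which reflects the requirement that each replica know about the other replicas in the topology.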
This type of replication does not provide a user with adequate assurance that each replica is correctly replicated with other replicas in the topology. Problems with conflicting data may arise when changes are made to the same data stored in different replicas. For example, a user may make changes to a contact stored on their desktop computer and subsequently make different changes to the same contact stored on a PDA. Another problem may arise with respect to changes made to different portions of corresponding objects on different replicas. For example, a change may be made to the address of a contact on the desktop computer while a phone number change is made to the same contact on the PDA. Replicating the entire contact would likely result in one of the changes being lost during replication.
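The loss of a change when an entire contact is replicated can be shown with a short sketch. The field names and the `replicate_whole_object` helper are hypothetical; the sketch assumes that the receiving replica simply replaces its copy with the sender's copy.

```python
# Both devices start from the same contact.
original = {"name": "A. Example", "address": "1 Old St", "phone": "555-0100"}

# The address changes on the desktop; the phone number changes on the PDA.
desktop = dict(original, address="2 New Rd")
pda = dict(original, phone="555-0199")


def replicate_whole_object(source):
    """Assumed whole-object replication: the receiver takes the sender's copy."""
    return dict(source)


# The PDA receives the desktop's contact as a single unit.
pda_after = replicate_whole_object(desktop)

# The new address arrives, but the PDA's phone number change is overwritten.
phone_change_lost = pda_after["phone"] != pda["phone"]
```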
The challenges of replication become more complicated when more than two replicas are included in the same sync community or topology. Among these challenges are replacing more current data with outdated data because of the order in which devices are replicated, sync loops in which data in a replica is continually updated and replaced with alternating versions of the data, increased overhead from replicating data that may already be in sync, and data that is in sync being reported as being in conflict.
For example, consider a sync community that includes three replicas. Replica 1 is updated at time 1 by a user. At time 2, the same data is updated in replica 2. Replica 2 then replicates with replica 3. When replica 1 subsequently replicates with replica 3, the data that originated on replica 2 may be replaced on replica 3 with the older update from replica 1. As a result, data that is chronologically more current may be replaced by out-of-date data.
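The three replica scenario above can be sketched as follows. The `make_replica` and `replicate` helpers are hypothetical, and the sketch assumes a receiver that applies incoming changes without comparing them against what it already holds, which is one way the described problem can occur.

```python
def make_replica(name):
    # items maps an item id to (value, time of change)
    return {"name": name, "items": {}, "last_sync": {}}


def replicate(source, dest, now):
    """Assumed naive replication: send everything the source changed since it
    last replicated with dest; dest applies it without comparison."""
    cutoff = source["last_sync"].get(dest["name"], 0)
    for item_id, (value, changed_at) in source["items"].items():
        if changed_at > cutoff:
            dest["items"][item_id] = (value, changed_at)
    source["last_sync"][dest["name"]] = now


r1, r2, r3 = (make_replica(n) for n in ("replica 1", "replica 2", "replica 3"))

r1["items"]["contact"] = ("older update", 1)  # replica 1 updated at time 1
r2["items"]["contact"] = ("newer update", 2)  # replica 2 updated at time 2

replicate(r2, r3, now=3)  # replica 3 receives the newer update
replicate(r1, r3, now=4)  # replica 3 is then overwritten with the older update

final_value = r3["items"]["contact"][0]
```

Because replica 1 has never replicated with replica 3, its change at time 1 still counts as new from its point of view, so the order of replication alone determines which value survives.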
Communication resources may also be wasted when multiple replicas incorrectly believe that information is out of sync, such that an unnecessary sync operation is performed. For example, consider again a three replica sync community. Replica 1 is updated by a user. Replica 1 then replicates with replica 2. The information in replica 2 is updated by the replication to reflect the changes to replica 1. Replica 2 then replicates with replica 3 such that the information from replica 2, which is the same information currently on replica 1, is updated on replica 3. Replica 3 then replicates with replica 1. Replica 3 does not know what version of the information is on replica 1, but only knows that replica 1 has been updated. Thus, replica 3 replicates its information with replica 1 even though that information is the same information already on replica 1. Data communication resources are thus needlessly consumed by the unnecessary replication. Additionally, other needless replications may continue as replica 1 replicates with replica 2 or in other pair-wise replications at subsequent times.
In some cases, replicated data may appear to be in conflict. For example, consider a three replica sync community. The information on replica 1 is updated and replicated with replica 2. The information on replica 1 is then replicated with replica 3. Replicas 2 and 3 then attempt a replication, only to discover that each has changes (received in the replication with replica 1) that have occurred since their last replication with each other. Thus, data that is actually replicated appears to be in conflict.
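A minimal sketch of this false conflict follows. The helper names (`make_replica`, `receive`, `appears_in_conflict`) are hypothetical, and the conflict test shown, which flags an item when both replicas modified it after their last mutual replication, is an assumed example of a time stamp based check rather than any particular product's method.

```python
def make_replica(name):
    return {"name": name, "values": {}, "modified": {}, "last_sync": {}}


def receive(dest, source, item_id, now):
    """Copy an item from source to dest and record when dest was modified."""
    dest["values"][item_id] = source["values"][item_id]
    dest["modified"][item_id] = now


def appears_in_conflict(a, b, item_id):
    """Assumed check: both replicas changed the item since they last replicated."""
    last = a["last_sync"].get(b["name"], 0)
    return (a["modified"].get(item_id, 0) > last
            and b["modified"].get(item_id, 0) > last)


r1, r2, r3 = (make_replica(n) for n in ("replica 1", "replica 2", "replica 3"))

r1["values"]["contact"] = "new phone"
r1["modified"]["contact"] = 1

receive(r2, r1, "contact", now=2)  # replica 1 replicates with replica 2
receive(r3, r1, "contact", now=3)  # replica 1 replicates with replica 3

same_data = r2["values"]["contact"] == r3["values"]["contact"]
flagged = appears_in_conflict(r2, r3, "contact")
```

Here `same_data` and `flagged` are both true: replicas 2 and 3 hold identical data, yet the check reports a conflict because neither replica knows that the other's change came from the same source.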
In other words, replication among two or more replicas is subject to various problems including unnecessary replications, wasted bandwidth, false conflict detection, inaccurate conflict resolution, and the like. These problems are magnified when the various replicas being replicated speak different protocols.