People use various devices to manage information. For example, a person may use a desktop computer to manage a calendar, maintain an address book, and keep a to-do list. The person may also use a laptop computer, a tablet computer, or a smart phone to manage the same information. To be most useful, each device should have the latest version of the information. This can be accomplished by synchronizing the information between the different devices.
People may also share information with others. For example, students may share class notes, and a teacher may publish class assignments on a web site. People can get the latest updates by synchronizing each device.
Data synchronization generally involves detection and identification of changes on each device, together with conflict resolution. Various techniques exist for achieving these aspects of data synchronization, including (1) comparing data records item by item, (2) logging changes made at each device, exchanging logs, and applying changes from each device's log to the other device, and (3) comparing the versions of each data record and choosing the newest version.
To illustrate the prior art technique of synchronizing by comparing data records item by item, FIG. 1a shows a system comprising two exemplary peer devices: local device 110L and remote device 110R. Local device 110L includes a local data store 120L, and remote device 110R includes a remote data store 120R. FIG. 1b is an example of a local dataset 130L residing in local data store 120L and containing three exemplary data records. Remote dataset 130R is a copy of local dataset 130L residing in remote data store 120R. FIG. 1c shows datasets 140L and 140R after some changes have occurred to datasets 130L and 130R respectively. Record A has changed in local dataset 140L, record B has changed in both datasets 140L and 140R, and record C has been deleted in local dataset 140L. FIG. 1d shows datasets 150L and 150R after data stores 120L and 120R are synchronized. Record A is copied to remote dataset 150R, any conflicts in the changes in record B are resolved and the result is copied to datasets 150L and 150R, and record C is copied back to local dataset 150L, undoing the deletion. Item-by-item record comparison thus suffers from a number of disadvantages: it gets slower as a dataset grows larger, it depends on the two devices sharing a common clock to determine which record is newer, and it does not handle deletion, because it cannot tell whether a record has been added to one dataset or deleted from the other since the last synchronization.
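The deletion problem of item-by-item comparison can be seen in a minimal Python sketch. The function name, the record layout (a value paired with a timestamp), and the newest-timestamp-wins conflict rule are assumptions made for illustration only; they are not taken from the figures.

```python
def compare_item_by_item(local, remote):
    """Merge two datasets (dict: key -> (value, timestamp)) record by record."""
    merged = {}
    for key in set(local) | set(remote):
        in_local, in_remote = key in local, key in remote
        if in_local and in_remote:
            # Assumed rule: keep the record with the later timestamp.
            # This only works if both devices share a common clock.
            merged[key] = max(local[key], remote[key], key=lambda rec: rec[1])
        else:
            # Present on one side only: "added there" and "deleted here"
            # look identical, so the record is simply copied across.
            merged[key] = local[key] if in_local else remote[key]
    return merged


# Record A changed locally, record B changed on both sides,
# record C was deleted locally (hypothetical values and timestamps).
local = {"A": ("a2", 5), "B": ("b-local", 6)}
remote = {"A": ("a1", 1), "B": ("b-remote", 7), "C": ("c1", 1)}
result = compare_item_by_item(local, remote)
```

In this sketch record C reappears in the merged dataset even though it was deleted locally, and the cost of the loop grows with the number of records in both datasets.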
To illustrate the prior art technique of synchronizing data using a change log, FIG. 2a shows two exemplary change logs. Change log 210L summarizes the changes to local dataset 130L, and change log 210R summarizes the changes to remote dataset 130R. Change log 210L shows that data records A and B have been updated in local dataset 130L, record C has been deleted, and record D has been added. Change log 210R shows that record B has been updated in remote dataset 130R. FIG. 2b shows datasets 220L and 220R after data stores 120L and 120R are synchronized. Record A is copied to remote dataset 220R, any conflicts in the changes in record B are resolved and the result is copied to datasets 220L and 220R, record C is deleted from remote dataset 220R, and record D is copied to remote dataset 220R. Although the change log technique overcomes the record comparison and deletion disadvantages of the item-by-item comparison technique, it suffers from two new disadvantages: the change log requires additional storage, and it is unclear when the change log can be pruned if more than two peer devices are synchronized at different times. If a peer device is abandoned, the change log may never be pruned.
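The change log exchange described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the log entry layout (operation, key, value), the function names, and the choice to report conflicting keys rather than resolve them are all hypothetical.

```python
def apply_log(dataset, log):
    """Apply change log entries (op, key, value) to a dataset (dict)."""
    for op, key, value in log:
        if op == "delete":
            dataset.pop(key, None)
        else:  # "update" or "add"
            dataset[key] = value


def sync_with_logs(local, local_log, remote, remote_log):
    """Exchange logs, apply non-conflicting changes, return conflicting keys."""
    local_keys = {key for _, key, _ in local_log}
    remote_keys = {key for _, key, _ in remote_log}
    conflicts = local_keys & remote_keys  # changed on both devices
    apply_log(remote, [e for e in local_log if e[1] not in conflicts])
    apply_log(local, [e for e in remote_log if e[1] not in conflicts])
    return conflicts  # left for a separate conflict-resolution step


# State after the local and remote edits (hypothetical values).
local = {"A": "a2", "B": "b-local", "D": "d1"}
local_log = [("update", "A", "a2"), ("update", "B", "b-local"),
             ("delete", "C", None), ("add", "D", "d1")]
remote = {"A": "a1", "B": "b-remote", "C": "c1"}
remote_log = [("update", "B", "b-remote")]

conflicts = sync_with_logs(local, local_log, remote, remote_log)
```

Here the deletion of record C propagates correctly because the log records it explicitly. The sketch also shows the cost: both logs must be stored until every peer has consumed them, and nothing in the logs themselves indicates when that has happened.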
To illustrate the prior art technique of synchronizing data by comparing versions of each data record, FIG. 3a is an example of a local dataset 310L residing in local data store 120L and containing three exemplary data records. Remote dataset 310R is a copy of local dataset 310L residing in remote data store 120R. Each record is marked by a “version vector,” a pair of version numbers, {L1, R1}, corresponding to record versions in the data stores of devices 110L and 110R. FIG. 3b shows datasets 320L and 320R after some changes have occurred to datasets 310L and 310R respectively. Record A has changed in local dataset 320L, record B has changed in both datasets 320L and 320R, and record C has been deleted in local dataset 320L. FIG. 3c shows datasets 330L and 330R after data stores 120L and 120R are synchronized. Record A is copied to remote dataset 330R, any conflicts in the changes in record B are resolved and the result is copied to datasets 330L and 330R, and record C is deleted from remote dataset 330R. Although the version comparison technique overcomes the disadvantages of record comparison and deletion as well as the overhead of a change log, it has a disadvantage of its own: each time a new peer device is introduced, the version vector of each data record grows because it must store a version number for the new device. Even when a device is abandoned, the version vector retains the version number of the abandoned device.
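The version vector comparison above can be sketched in Python. The dictionary representation (device identifier mapped to a counter), the function name, and the returned labels are illustrative assumptions; the device identifier "T" stands for a hypothetical third peer.

```python
def compare_vectors(v1, v2):
    """Compare two version vectors (dict: device id -> version number)."""
    devices = set(v1) | set(v2)
    # A device missing from a vector is treated as version 0.
    v1_ahead = any(v1.get(d, 0) > v2.get(d, 0) for d in devices)
    v2_ahead = any(v2.get(d, 0) > v1.get(d, 0) for d in devices)
    if v1_ahead and v2_ahead:
        return "conflict"       # changed independently on both devices
    if v1_ahead:
        return "first_newer"
    if v2_ahead:
        return "second_newer"
    return "equal"


# Record A changed only on the local device (L); record B changed on both.
status_a = compare_vectors({"L": 2, "R": 1}, {"L": 1, "R": 1})
status_b = compare_vectors({"L": 2, "R": 1}, {"L": 1, "R": 2})

# Introducing a third device adds an entry to the vector of every record,
# and that entry remains even if the device is later abandoned.
vector_with_third_device = {"L": 2, "R": 1, "T": 1}
```

No shared clock is needed, because each entry is a per-device counter. The cost is the growth shown in the last line: every record carries one entry per device that has ever participated.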
Additionally, none of the techniques described above intrinsically handles an interrupted synchronization. A synchronization operation between two devices may be interrupted for a variety of reasons. For example, the user may manually abort the synchronization operation to perform an urgent task, one of the devices may lose power unexpectedly, or the devices may move outside wireless range and lose connectivity. In general, an interrupted synchronization either must be restarted from scratch or requires additional information about the synchronization state so that it can be resumed later.
Prior developments have not taught or suggested any solutions to overcome all of the limitations described above, and thus, solutions to overcome these limitations have long eluded those skilled in the art.