A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates generally to management of information or datasets stored on information devices and, more particularly, to systems implementing methods for maintaining synchronization of datasets among such devices.
With each passing day, there is ever-increasing interest in providing synchronization solutions for connected information devices (CIDs). Here, the general environment includes CIDs in the form of electronic devices including, for example, cellular phones, pagers, other hand-held devices (for example, REX PRO™, PalmPilot and Windows CE devices), personal computers (PCs) of all types and sizes, and Internet or intranet access devices (for example, PCs or embedded computers running, for example, Java virtual machines or browsers or Internet Protocol (IP) handlers).
A problem found in such an environment today is that these devices, and the software applications running on these devices, do not communicate well with one another and are typically not designed with data synchronization in mind. In particular, a problem exists as to how one integrates information, such as calendaring, scheduling, and contact information, among disparate devices and software applications. Consider, for instance, a user who has his or her appointments on a desktop PC at work, but also has appointments on a notebook computer at home and on a battery-powered, hand-held device that is used in the field. The user is free to alter such information on any one of these devices. What the user really wants is for the information (for example, appointments) in each device to remain synchronized with corresponding information in all devices in a convenient, transparent manner. Still further, some devices (for example, PCs) are typically connected at least occasionally to a server computer, for example, an Internet server, which stores information for the user. The user would of course like the information on the server computer to participate in the synchronization, so that the server also remains synchronized.
There have been attempts to solve the problem of synchronizing datasets across different devices or software applications, even if the datasets were not designed with mutual synchronization in mind. An early approach to maintaining consistency between datasets was simply to import or copy one dataset on top of another. This simple approach, one which overwrites a target dataset without any attempt at reconciling any differences, is inadequate for all but the simplest of applications. Expectedly, more sophisticated synchronization techniques were developed. In particular, techniques were developed for attempting to reproduce in each dataset the changes found in other dataset(s) since a previous synchronization. Techniques were developed for resolving any conflicts involving such changes, automatically or with user assistance. Some earlier examples of such techniques were limited to "point-to-point" synchronization, in which exactly two datasets are synchronized. Later, "multi-point" techniques were developed by Starfish Software, Inc. ("Starfish"), the present assignee, that are capable of synchronizing arbitrarily many datasets using a single synchronization system or in response to a single interaction with a user. Starfish's synchronization techniques are described for example in U.S. patent application Ser. No. 09/136,215, which has been incorporated by reference. Starfish's synchronization systems may be implemented on server computers, such as an Internet server, to provide synchronization services to remotely located datasets, provided that the proper accessors for interfacing with datasets are available. A version of Starfish's Internet-based synchronization system is called the TrueSync® Server, or "TSS". (TrueSync® is a registered trademark of Starfish. REX™ and REX PRO™ are trademarks of Franklin Electronic Publishers of Burlington, N.J. REX and REX PRO devices include licensed technology from Starfish.)
A limitation of the existing synchronization systems is that they do not handle interrupted synchronization sessions in an efficient or always-desirable manner. This can be a problem, especially if the connection to a particular dataset is broken during a synchronization session, for example, due to a failure in the communication channel. If a synchronization is interrupted, much of the synchronization work that has already been performed cannot easily be used and is discarded. One bad consequence is that the user is typically forced to repeat an entire, time-consuming synchronization session with the dataset from the beginning, instead of being able to resume largely from where the previous session ended. Another bad consequence is that even though many dataset changes from the dataset may have already been tediously received and processed prior to the breaking of the connection, the user cannot immediately use these many dataset changes in other datasets. Instead, the user is generally forced to wait until the connection is restored and a full synchronization session with the particular dataset is repeated or re-performed. To some degree, users may be willing to tolerate such inconveniences when using synchronization systems (for example, PC-based implementations) that primarily use relatively reliable connection means, such as direct serial-line or PC-Card connections, to datasets. However, as information appliances increasingly use ever more diverse and potentially less reliable ways of connecting to synchronization systems, these inconveniences become less tolerable.
It is helpful to examine the above-identified deficiencies of existing synchronization systems in more detail. During a synchronization session, an existing synchronization system typically determines the changes that have occurred in a dataset, for example a PalmPilot organizer's dataset, since a prior synchronization. After the synchronization session, the changes have been propagated by the synchronization system into other, target dataset(s) for use. These target dataset(s) may include ordinary user datasets, for example a PC-based PIM application. These target dataset(s) also may include a central-repository dataset controlled by the synchronization system itself, which dataset is sometimes referred to as the GUD, or Grand Unification Database. The problem with existing synchronization systems is that if a connection to a dataset being synchronized is broken during a synchronization session, any partial set of changes from the dataset that have already been seen or processed prior to the connection failure is not yet integrated into the target dataset(s). Therefore, the partial set of already-seen changes cannot be used in the target dataset(s) for user viewing or in other synchronization sessions. Further, the partial set of received changes generally cannot even easily be used, after the connection is restored, for resuming synchronization with the same particular dataset from roughly where the interrupted session left off. This "all-or-nothing" approach with regard to making received changes available can cause significant delay and waste of resources. The inconvenience is especially likely and objectionable when synchronizing with datasets that connect to the synchronization system using frequently-broken or undependable connections, such as Internet-based or other remote connections. Further, the inconvenience is especially likely and objectionable when the synchronization session is long, for example because the dataset(s) being synchronized are large or because many datasets are being synchronized in a single (multi-point) session.
What is needed are systems and techniques that allow use of already-received changes in target datasets even if the synchronization session fails, and even before the failed synchronization session is re-performed or resumed to completion. What is also needed are systems and techniques for synchronization that minimize the need to repeat already-performed work after an interrupted synchronization session.
The present invention fulfills these and other needs.
The present invention makes possible synchronization of databases in a manner that allows use of already-received changes in target datasets even if the synchronization session fails, and even before the failed synchronization session is re-performed or resumed to completion. The present invention also makes possible synchronization of databases in a manner that minimizes the need to re-send dataset changes that have already been sent in an earlier, failed synchronization session.
According to an embodiment of the invention, a method is provided for synchronizing at least a first dataset and a second dataset, from a plurality of datasets, with a reference dataset, wherein a plurality of changes have been made to the first dataset since a previous synchronization of the first dataset with the reference dataset. According to the method, a description is stored of the correspondence between data records of the reference dataset and data records of each of the plurality of datasets. Further, at least a first change of the plurality of changes is received from the first dataset for possible propagation to the reference dataset. After the receipt of the first change, the first change is propagated into the reference dataset, to the extent that the first change can be reconciled with the reference dataset, without requiring that all of the plurality of changes have already been received for possible propagation to the reference dataset. Any remaining changes of the plurality of changes, and any changes that have been made to the second dataset since a previous synchronization of the second dataset, are also propagated into the reference dataset, to the extent that such changes can be reconciled with the reference dataset. Additionally, changes are propagated to the first and the second dataset from the reference dataset, to the extent that such changes are not present at the first and the second dataset.
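The inbound half of the method described above, in which each received change is propagated into the reference dataset without waiting for the remaining changes, may be illustrated by the following minimal Python sketch. It is not the disclosed implementation. All names (Change, ReferenceDataset, receive_changes, flaky_stream) are hypothetical, reconciliation is reduced to an assumed newest-timestamp-wins rule, resumption is assumed to work by counting the changes already received, and propagation from the reference dataset back to the first and second datasets is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    record_id: str   # record ID local to the sending dataset
    value: str       # new record contents
    timestamp: int   # modification time; assumed basis for reconciliation


@dataclass
class ReferenceDataset:
    records: dict = field(default_factory=dict)  # ref_id -> (value, timestamp)
    mapping: dict = field(default_factory=dict)  # (dataset, record_id) -> ref_id

    def apply(self, dataset, change):
        """Propagate one change at once; return True if it was applied."""
        key = (dataset, change.record_id)
        if key not in self.mapping:
            # Store the correspondence between the dataset's record and ours.
            self.mapping[key] = f"ref-{len(self.mapping) + 1}"
        ref_id = self.mapping[key]
        current = self.records.get(ref_id)
        if current is not None and current[1] >= change.timestamp:
            return False  # reference copy is at least as new; keep it
        self.records[ref_id] = (change.value, change.timestamp)
        return True


def receive_changes(ref, dataset, incoming):
    """Apply changes as they arrive; return the count received before any failure."""
    received = 0
    try:
        for change in incoming:
            ref.apply(dataset, change)
            received += 1
    except ConnectionError:
        pass  # changes applied so far stay in the reference dataset
    return received


def flaky_stream(changes, fail_after):
    """Yield changes, simulating a dropped connection after `fail_after` items."""
    for index, change in enumerate(changes):
        if index == fail_after:
            raise ConnectionError("link dropped")
        yield change


changes = [Change("a", "Dentist 9am", 10), Change("b", "Lunch 12pm", 11),
           Change("c", "Review 3pm", 12)]
ref = ReferenceDataset()

# First session breaks after two changes; both are already usable.
done = receive_changes(ref, "handheld", flaky_stream(changes, fail_after=2))
usable_after_failure = len(ref.records)

# Resumed session sends only the changes not yet received.
done += receive_changes(ref, "handheld", iter(changes[done:]))
```

In this sketch the two changes received before the simulated connection failure remain in the reference dataset, and the resumed session transmits only the third change rather than repeating the entire session.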