1. Field of the Invention
The present invention generally relates to methods and apparatus for synchronizing digital information combined with digital software code in a way which allows the data to be created or stored and then sent via various network connections, such as client-server or peer-to-peer connectivity, without producing duplication errors and while keeping all data globally addressable regardless of where on the network it is created or stored. The method can be used to track documents and their revisions in a networked environment where there is no need for a single database to keep track of the identifiers used to address the documents.
2. Prior Art
Several types of methods already exist for synchronizing data in client-server or peer-to-peer network environments. These efforts focus on the synchronization of data objects, as in source code control systems; of software interfaces, as in distributed execution frameworks; and of client-server applications, such as some email applications or specialized form filling applications. The following paragraphs detail the broad categories in which these systems operate to provide background for the present invention.
Source code control type systems work on the principle that each client computer stores local copies of specific versions of files which are controlled and stored in a master database. This database is often called the master server, the source control server, the repository or the depot. In the depot an SCDB (Source Control Database) saves each specific file and its entire history as a series of versions. In other words, the SCDB holds every version of each stored file over time. A client computer in the system can save a new version of a file to the SCDB, which adds the new version as the top or head entry. At any time a client computer can request any of the previous versions of the file or add a new version to create a new top entry. Internally the SCDB may store differences between the various versions of the file to save space, and in a sophisticated source control system only the differences may be transmitted between the client and the server to save communication bandwidth over the network. The set of files the client has at any given time is called a view. Within a view a client cannot automatically refer to multiple versions of the same file; this is possible only by explicit linking, such as through a programmatic control interface like a config file. This can be cumbersome if a user wishes to store multiple versions of the same file and does not have detailed knowledge about source control systems, and impossible if the user is not versed in technology such as config files.
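The depot behavior described above can be illustrated with a minimal sketch. The class name VersionStore and its methods check_in and check_out are hypothetical names chosen for illustration and do not correspond to any particular source control product. The sketch also assumes full copies are stored rather than differences, which a real SCDB may not do.

```python
class VersionStore:
    """Toy stand-in for an SCDB: each file path maps to its full version history."""

    def __init__(self):
        self._history = {}  # path -> list of contents, oldest version first

    def check_in(self, path, content):
        """Append a new version, which becomes the head. Returns its 1-based number."""
        versions = self._history.setdefault(path, [])
        versions.append(content)
        return len(versions)

    def check_out(self, path, version=None):
        """Return the requested version, or the head when no version is given."""
        versions = self._history[path]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{path} has no version {version}")
        return versions[version - 1]


store = VersionStore()
store.check_in("report.txt", "first draft")
store.check_in("report.txt", "second draft")

# A view is keyed by file path, so it holds exactly one version of each file.
view = {"report.txt": store.check_out("report.txt")}
```

Because the view in this sketch is keyed by path alone, holding both drafts side by side would require the user to invent a second name for one of them, which is the limitation noted above.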
Distributed software interfaces such as COM, DCOM and COM+ from Microsoft and CORBA from the Object Management Group consortium use universal identifiers to synchronize interfaces of distributed components. These types of distributed systems work by a method of interface description and interrogation followed by an agreed upon calling convention. For example, a component on a personal computer may make a request for a certain type of software object or interface. The distributed mediator system (for example COM/DCOM/COM+ from Microsoft or the CORBA subsystem from OMG) then looks to see if that interface exists on a component either on the local computer or on a network attached computer. In order to tell the different software interfaces apart across vendors and across versions of software, a UUID (Universally Unique IDentifier) system is used. In this method all interfaces are assigned, via a complex generation technique, a random number of sufficient length (say 128 bits or more) that any machine or software vendor can generate such a number and be confident that the number is unique in the entire software universe. This means there is no need for a centralized service which distributes the unique identifiers. These UUIDs are then used as the identifiers to which exact software interfaces are bound. For example, a software service on one machine will ask for an interface by its number, and the subsystem will then look to see if it can find a component which has this number as its designation. Upon finding it, the calling component can be confident it is dealing with an interface it knows how to programmatically interact with. A secondary portion of this system is the use of Interface Description Languages (also called IDLs).
Once the interface with the corresponding UUID is located and the underlying system facilitates the necessary communications, the calling program can programmatically test that the functional parameters, data structures, and method calls it wishes to use are supported. While such means are excellent at solving the problem of interface description and distributed method calling conventions, they are not used for actually synchronizing files and data objects. This level of detail is left up to the various software modules to negotiate among themselves. For example, a source control system could use this method to discover clients and servers that understand the source control exchange protocol for differential updates, but the actual synchronization of the data and its versions is left up to the internal programming inside the source control software modules.
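The lookup-by-identifier pattern described above can be sketched as follows. This is an illustration only and is not the COM or CORBA programming interface: the Registry and SpellChecker classes and the SPELL_CHECK_IID constant are hypothetical, and a plain in-memory dictionary stands in for the mediator subsystem.

```python
import uuid

# Hypothetical interface identifier, generated once by the interface's author.
# uuid4() produces a random 128-bit identifier without any central authority.
SPELL_CHECK_IID = uuid.uuid4()


class SpellChecker:
    """Hypothetical component that implements the spell check interface."""

    def check(self, word):
        return word.isalpha()


class Registry:
    """Toy stand-in for the mediator subsystem: maps interface UUIDs to components."""

    def __init__(self):
        self._components = {}

    def register(self, iid, component):
        self._components[iid] = component

    def query(self, iid):
        """Return the component bound to this identifier, or None if absent."""
        return self._components.get(iid)


registry = Registry()
registry.register(SPELL_CHECK_IID, SpellChecker())

# The caller asks for the interface by number, then tests that the
# method it intends to call is actually supported.
component = registry.query(SPELL_CHECK_IID)
supports_check = callable(getattr(component, "check", None))

# An identifier nobody registered is simply not found.
missing = registry.query(uuid.uuid4())
```

Note that nothing in this sketch synchronizes files or data; it only locates a component, which matches the limitation described above.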
Email systems and distributed form filling applications, such as handheld client-server applications, work to give users varying amounts of service when the user is connected to the network and then attempt to bring a portion of that service to the user even when network connectivity is not available. In email systems this is done in one of two general ways. The first method is to provide a web based interface, such as those provided by major web service providers or search engine firms. However, these lack the ability to work offline, so often a separate native software program is written to interface to the server and provide offline email services via a standard protocol such as IMAP or POP3. The challenge of such hybrid web and native client solutions is that the user experience is extremely disjointed: even when the software is working properly, the user interface of the web based presentation and that of the native software are quite different, and the user must maintain a mental model of the buttons, controls and layouts which correspond to like actions. Another method to achieve distributed email functionality is to use native software clients exclusively; however, this requires the native software to be installed on every machine from which the user wishes to access their email. At times this is not possible, such as when a user logs in from a guest terminal at an internet cafe. The present invention, as shall be explained in the objects and advantages section, addresses this need in a systematic fashion. Finally, form filling and survey applications, including CRM (Customer Relationship Management), are primarily designed as online systems (for example, web based systems where the user logs in to a centralized server or database application). The data for these types of applications is constantly being updated with new information from distributed users or other services (such as price lists and quotes).
However, many persons within the organization, such as a sales force, need some portions of the application to be available when connectivity is not present, such as during a customer visit. For these situations many systems choose either the email type solution discussed earlier, where a separate client application is installed on a local device such as the salesperson's laptop computer, or a solution in which the form is cached as a page. Both of these solutions fail to create a truly seamless and portable experience: either the experience is not portable across multiple computers or guest computers, or parts of the information are made available in a piece-wise manner, resulting in non-optimal solutions from a productivity perspective.
The following directly cited prior art is related to the present invention while not addressing all the advantages of the present invention.
The following art demonstrates the state of the art in data synchronization for web based applications and data.
U.S. Pat. No. 6,954,757 describes a “Zero Latency Enterprise” (ZLE), which is a business model for showing the benefits of distributed synchronized software, but it does not provide a method for performing the synchronization of data and applications without user intervention or without programmers rewriting their code for the various supported platforms.
The following references describe methods for generation of universally unique identification numbers, either with code or with specifications which are adopted by standards bodies: “GUID generation” <http://www.vbaccelerator.com/codelib/tlb/guid.htm> and
RFC 4122, UUID generation (IETF) <http://www.faqs.org/rfcs/rfc4122.html>. These references do not describe how to set up a synchronized data system, and neither describes methods for synchronizing running application code. Related to this, Berners-Lee, T., “Universal Resource Identifiers in WWW,” RFC 1630, June 1994, describes how such universal links can be made compatible with the World Wide Web infrastructure.
To create UUIDs it is common to use various cryptographic hash algorithms. The two most common, MD5 and SHA-1, are both listed here: Rivest, R., “The MD5 Message-Digest Algorithm”, RFC 1321, April 1992; National Institute of Standards and Technology, “Secure Hash Standard”, FIPS PUB 180-1, April 1995, <http://www.itl.nist.gov/fipspubs/fip180-1.htm>. The present invention makes use of these as building blocks. Neither of these standards describes a system; they describe only mechanisms for generating the unique identifiers, mostly for purposes of digital signature generation.
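As an illustration of hash-based identifier generation, the following sketch uses the name-based UUID variants defined in RFC 4122, which are built on MD5 (version 3) and SHA-1 (version 5). It relies on the Python standard library uuid and hashlib modules. The document name is a hypothetical example, and the sketch is not presented as the seeding method of the present invention.

```python
import hashlib
import uuid

# Hypothetical document name used only for illustration.
name = "https://example.com/documents/report.txt"

# RFC 4122 name-based UUIDs: version 3 hashes with MD5, version 5 with SHA-1.
md5_based = uuid.uuid3(uuid.NAMESPACE_URL, name)
sha1_based = uuid.uuid5(uuid.NAMESPACE_URL, name)

# The same namespace and name give the same identifier on any machine,
# so no central service is needed to hand out the number.
same_again = uuid.uuid5(uuid.NAMESPACE_URL, name)

# The raw digests differ in size: MD5 yields 16 bytes and SHA-1 yields 20,
# so a version 5 UUID keeps only part of the SHA-1 output to fit 128 bits.
md5_digest = hashlib.md5(name.encode("utf-8")).digest()
sha1_digest = hashlib.sha1(name.encode("utf-8")).digest()
```

Two machines that agree on the namespace and the name arrive at the same identifier independently, whereas random (version 4) UUIDs give distinct identifiers each time they are generated.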
The following reference shows how to synchronize LDAP (a distributed directory service): http://www.openldap.org/conf/odd-wien-2003/jong.pdf. However, it does not exploit the use of UUIDs, which can be generated by any device on the network, and it does not investigate synchronization of elements beyond those necessary to run a distributed directory service.
Several methods exist to replicate a database, whether a relational database or a file based database (source control), but these methods do not address the issue of generating keys and signatures for each item asynchronously of each other. In the relational database the data is cloned from one database to another, and the idea that primary keys in the same namespace can be asynchronously generated to manage the individual data elements is not used. Instead, keys from a single database are copied to the new database. If two different databases are to be merged, then colliding primary keys must be handled by the process performing the merge.
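The key collision problem can be made concrete with a small sketch. Two databases are modeled as plain dictionaries, and the merge function and sample records are hypothetical. The second half contrasts this with keys generated independently as random UUIDs, as an illustration of asynchronously generated keys in one namespace rather than as a description of any specific replication product.

```python
import uuid


def merge(db_a, db_b):
    """Merge two key-value stores; report keys that map to different records."""
    merged = dict(db_a)
    collisions = []
    for key, value in db_b.items():
        if key in merged and merged[key] != value:
            collisions.append(key)  # the merging process must resolve these
        else:
            merged[key] = value
    return merged, collisions


# Each office numbers its records independently, starting from 1.
office_a = {1: "invoice", 2: "quote"}
office_b = {1: "contract", 2: "memo"}
_, sequential_collisions = merge(office_a, office_b)

# With keys generated independently as random 128-bit UUIDs,
# the two key sets do not overlap in practice.
office_a_uuid = {uuid.uuid4(): record for record in office_a.values()}
office_b_uuid = {uuid.uuid4(): record for record in office_b.values()}
merged_uuid, uuid_collisions = merge(office_a_uuid, office_b_uuid)
```

In the sequential case every key collides and the merging process has to renumber or otherwise reconcile the records; in the UUID case all four records can be combined without any reconciliation step.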
U.S. Pat. No. 5,920,863 “System and method for supporting transactions for a thin client lacking a persistent store in a distributed object-oriented environment” describes how to use UUIDs to manage remote procedure calls such as described in the above section on distributed software interfaces. It does not detail the rules for seeding the UUIDs or for synchronizing data or entire applications to the local computer.
U.S. Pat. No. 6,988,137 “System and method for synchronizing objects between two devices” uses UUIDs (called GUIDs in the patent) for synchronizing two sporadically connected volumes and then managing subIDs within the volume to synchronize the data. However this does not allow for synchronization across multiple servers (volumes) and also ties data to the volume upon which it is stored. The present invention sets up a method of synchronization where the elements are independent of the volume types or labels upon which they are cached or stored.
U.S. Pat. No. 5,574,898 “Dynamic software version auditor which monitors a process to provide a list of objects that are accessed” demonstrates a modern source control system in which versions are maintained in a centralized database and views are managed by end developers (client computers) to access specific file versions. However, this requires a great amount of sophistication on the part of the user to manage the various versions of the files. Also, there is no way to provide synchronization across multiple servers, as there must be a master server maintaining all the versioning data. Lastly, the system deals with synchronizing files, such as those used for programming activities, but does not handle synchronized metadata applied to those files, nor does it handle actual running programs and program distribution.
U.S. Pat. No. 6,374,289 “Distributed client-based data caching system” describes a data system for caching distributed data and for efficiently describing where, in a peer-to-peer environment, the data can be found. However, this does not address the need to determine in a concise manner which data in the network is the same, or the need to deal with applications, rather than data, which may be distributed to client computers.