The age of the Internet, still in its infancy, has already produced a dramatic change in information access across the globe. While web sites and unstructured email predominate as the vehicles of choice for information access, numerous competing media, both old and new, are also benefiting from the profound power of this massively interconnected computer network. Among these are distributed database systems, into which the Internet has breathed new life. The term Distributed Systems is used here to mean the propagation of multiple copies of like databases to disparate locations, as opposed to direct access to a single central copy of a database. At first glance, the use of Distributed Systems across the Internet appears to be in opposition to the principle of the Internet: real-time access. However, this view not only overestimates the degree to which access is constant or acceptably fast for all concerned, but also ignores the benefits of redundancy, cross-referencing and decentralization.
With Distributed Systems (at least insofar as modifiable structured database systems are concerned) comes the need to keep multiple copies of any particular piece of data the same throughout all databases in which it may exist. The prior art provides many techniques to address this need. These divide generally along three axes: synchronization method (direct comparison vs. replication); distribution methodology (client-server vs. peer-to-peer); and modifiability (master-slave vs. open modification access).
The synchronization axis relates to the means by which two database systems learn what changes have taken place on one or both databases. Direct comparison here means comparing two databases record by record to discover and correct differences between them. This can be distinguished from replication, which is the general methodology of capturing database operations as they occur on one database so that these operations can be played out, or replicated, on one or more other databases. The prior art has generally favored replication for its efficiency and scalability. Efficiency is gained because less communication is required between two databases if each comes to the table already knowing what changes have occurred to itself since the last communication. Scalability is gained because the replication instructions that have been collected locally can be shared with multiple database copies, whereas direct comparison requires a separate comparison for each additional database (A to B; A to C; . . . A to n).
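The contrast between the two synchronization methods can be sketched in a few lines of Python. This is a minimal illustration only, not a method taken from the prior art discussed here: the dictionary-based record model, the `LoggedDatabase` class, and the `direct_compare` and `replay` functions are all hypothetical names introduced for the example.

```python
from typing import Dict, List, Tuple

Database = Dict[str, str]         # hypothetical model: record key -> value
Operation = Tuple[str, str, str]  # (kind, key, value); kind is "put" or "delete"


def direct_compare(source: Database, target: Database) -> int:
    """Make target match source by examining every record.

    Returns the number of records examined, which grows with database
    size regardless of how few records actually changed.
    """
    examined = 0
    for key in set(source) | set(target):
        examined += 1
        if key not in source:
            del target[key]                 # record was deleted at the source
        elif target.get(key) != source[key]:
            target[key] = source[key]       # record is missing or differs
    return examined


class LoggedDatabase:
    """A database copy that captures each operation as it occurs."""

    def __init__(self) -> None:
        self.records: Database = {}
        self.log: List[Operation] = []

    def put(self, key: str, value: str) -> None:
        self.records[key] = value
        self.log.append(("put", key, value))

    def delete(self, key: str) -> None:
        self.records.pop(key, None)
        self.log.append(("delete", key, ""))


def replay(log: List[Operation], target: Database, start: int = 0) -> int:
    """Apply logged operations from position start onward to target.

    Returns the number of operations applied, which depends only on how
    many changes occurred since the last synchronization.
    """
    applied = 0
    for kind, key, value in log[start:]:
        if kind == "put":
            target[key] = value
        else:
            target.pop(key, None)
        applied += 1
    return applied
```

In this sketch the same slice of the log can be replayed against any number of copies, while `direct_compare` must walk the full record set once per copy, which mirrors the efficiency and scalability argument above.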
The distribution axis pertains to the method by which one database copy interacts with one or more other database copies to maintain synchronization. This is less about the actual communication link used and more about the structure of the community of database copies and the roles each plays in relation to the others. Client-server is used here to represent a central or hierarchical methodology, whereby authority within the community emanates from a central point, through which consistency and security are ensured. The clients in the network do not communicate with each other, but rather with the server, each relying on its direct relationship with the server to accomplish synchronization with other clients. In the peer-to-peer methodology, however, each peer communicates with all other peers, none of which plays a role significantly different from the others. Each methodology has advantages and disadvantages. Client-server methodologies benefit from greater consistency and control over data, as well as reduced processing demands on client machines, while suffering from limited scalability, bottlenecks in data throughput, and a lack of flexibility and distribution of authority. Peer-to-peer methodologies, meanwhile, provide a decentralized environment, helping to alleviate communication bottlenecks and offering greater flexibility, but suffer from increased demands on local processing, threats to data consistency due to the lack of a central authority, and limits to scalability in cases where a direct connection is necessary between each peer that originates a change and each peer that must receive it.
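The trade-off between the two distribution methodologies can be made concrete with a toy message-counting model. The sketch below assumes that every node originates exactly one change per round and that each change must reach every other node; the function names and the `"server"` label are hypothetical and exist only for this illustration.

```python
from collections import Counter
from typing import List, Set, Tuple


def _send(sender: str, receiver: str, load: Counter, links: Set[frozenset]) -> None:
    """Record one message: both endpoints do work, and one link is used."""
    load[sender] += 1
    load[receiver] += 1
    links.add(frozenset((sender, receiver)))


def client_server_round(clients: List[str], server: str = "server") -> Tuple[Counter, int]:
    """Each client uploads one change; the server relays it to the other clients."""
    load: Counter = Counter()
    links: Set[frozenset] = set()
    for origin in clients:
        _send(origin, server, load, links)
        for other in clients:
            if other != origin:
                _send(server, other, load, links)
    return load, len(links)


def peer_to_peer_round(peers: List[str]) -> Tuple[Counter, int]:
    """Each peer sends its change directly to every other peer."""
    load: Counter = Counter()
    links: Set[frozenset] = set()
    for origin in peers:
        for other in peers:
            if other != origin:
                _send(origin, other, load, links)
    return load, len(links)
```

Under these assumptions, ten nodes in the client-server arrangement need only 10 links, but the server handles 100 messages per round against 10 for each client, which is the bottleneck described above. The same ten nodes as peers spread the work evenly at 18 messages each, but require 45 distinct links, a number that grows roughly with the square of the number of peers.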
The modifiability axis concerns limitations on who can make changes to records. A dichotomy is formed in the prior art along this axis as to whether or not more than one site is allowed to modify any particular record. Restricting modification of each record to a single site (the master-slave approach) greatly reduces the complexity of synchronization, because it removes the need to handle conflicting edits to the same record. However, it also severely restricts the users of, and interfaces into, these systems, because a user at a remote location must access the data at the “master” location in order to modify it.
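Why single-site modification simplifies synchronization can be shown with a small sketch. It assumes, purely for illustration, that each record carries a version number and that each edit notes the version it was based on; this versioning scheme and the names `Edit`, `master_slave_accepts`, and `find_conflicts` are hypothetical and are not drawn from the prior art described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Edit:
    site: str          # site that made the change
    key: str           # record being modified
    base_version: int  # version of the record the site saw before editing (assumed scheme)
    value: str


def master_slave_accepts(edit: Edit, master_site: str) -> bool:
    """Under master-slave, only edits made at the master site are allowed."""
    return edit.site == master_site


def find_conflicts(edits: List[Edit]) -> List[Tuple[Edit, Edit]]:
    """Return pairs of edits from different sites that changed the same
    version of the same record, which is the case open modification
    must detect and resolve."""
    seen: Dict[Tuple[str, int], List[Edit]] = {}
    conflicts: List[Tuple[Edit, Edit]] = []
    for edit in edits:
        slot = (edit.key, edit.base_version)
        for earlier in seen.get(slot, []):
            if earlier.site != edit.site:
                conflicts.append((earlier, edit))
        seen.setdefault(slot, []).append(edit)
    return conflicts
```

If edits are first filtered through `master_slave_accepts`, all surviving edits come from one site and `find_conflicts` can never report a pair. With open modification, two sites editing the same version of a record produce a conflict that some additional policy, not shown here, would have to resolve.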
Along the axes described above, the prior-art approaches that use replication (rather than direct comparison) across either a client-server or a peer-to-peer network, and that allow data to be modified from any copy of the database, provide the greatest functionality to distributed systems. However, as also illustrated above, both the client-server and peer-to-peer methodologies suffer limitations.