Under certain conditions, it is desirable to store copies of a particular body of data, such as a relational database table, at multiple sites in a distributed computer network. If users are allowed to update the body of data at one site, the updates must be propagated to the copies at the other sites in order for the copies to remain consistent. The process of propagating the changes is generally referred to as replication. Various mechanisms have been developed for performing replication. One such mechanism is described in the commonly assigned U.S. patent application Ser. No. 08/126,586 entitled "Method and Apparatus for Data Replication," filed on Sep. 24, 1993 by Sandeep Jain and Dean Daniels, now abandoned, the contents of which are incorporated herein by reference.
The site at which a change is initially made to a set of replicated data is referred to herein as the "source site." The sites to which the change must be propagated are referred to herein as "destination sites." If a user is allowed to make changes to copies of a particular table that is replicated at different sites, those sites are source sites with respect to the changes initially made to their own copy of the table and destination sites with respect to the changes initially made to copies of the table at other sites. Replication does not require an entire transaction executed at a source site to be re-executed at each of the destination sites. Only the changes made by the transaction need to be propagated. Other types of operations, such as read and sort operations, that may have been executed in the original transaction do not have to be re-executed at the destination sites.
There are two basic approaches to replication: synchronous replication and asynchronous replication. In synchronous replication, each update or modification of a body of data is immediately replicated to all other replicas or copies of the body of data within the distributed network, typically by techniques such as a two-phase commit. The transaction that modifies the body of data is not allowed to complete until all other replicas have been similarly updated. Although synchronous replication provides a straightforward methodology for maintaining data consistency in a network, this method is susceptible to network latencies and intermittent network failures. Furthermore, synchronous replication cannot prioritize updates; accordingly, low priority updates can unnecessarily produce significant system delays.
On the other hand, in asynchronous replication, local replicas of a particular data structure are allowed to be slightly different for a time until an asynchronous update is performed. During asynchronous replication, a distributed node can modify its local copy of a data structure without forcing a network access, as in the synchronous replication methodology. At some later point in time, the modification is propagated to the destination sites. Various techniques for asynchronous propagation have been developed, for example, remote procedure calls (RPCs) and deferred transaction queues.
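By way of illustration only, the deferred-queue style of asynchronous propagation described above might be sketched in Python as follows. The `Site` class, its `local_update` and `push` methods, and the in-memory dictionaries standing in for tables are hypothetical names chosen for this example and are not taken from any referenced application. The sketch omits conflict detection entirely and simply shows that a local change completes without network access and that the replicas differ until the queue is pushed.

```python
from collections import deque


class Site:
    """Hypothetical replication site holding one table and a deferred queue."""

    def __init__(self, name):
        self.name = name
        self.table = {}          # primary key -> {column: value}
        self.deferred = deque()  # changes awaiting propagation

    def local_update(self, key, column, value):
        """Apply a change locally and queue it; no network access is forced."""
        row = self.table.setdefault(key, {})
        old = row.get(column)
        row[column] = value
        # The old value is queued alongside the new one; this sketch does not
        # use it, but a conflict detection step could.
        self.deferred.append((key, column, old, value))

    def push(self, destination):
        """Propagate queued changes to a destination site at a later time."""
        sent = 0
        while self.deferred:
            key, column, _old, new = self.deferred.popleft()
            destination.table.setdefault(key, {})[column] = new
            sent += 1
        return sent


site_a, site_b = Site("A"), Site("B")
site_a.local_update(100, "comm", 95)
diverged = site_a.table != site_b.table   # replicas differ for a time
sent = site_a.push(site_b)                # deferred propagation
converged = site_a.table == site_b.table  # replicas agree after the push
```

In this sketch only the changed column is propagated, not the operations of the original transaction, which is consistent with the observation above that read and sort operations need not be re-executed at destination sites.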
In asynchronous replication, conflicts in updating a body of data might occur if two sites concurrently modify the same data item before the data modification can be propagated to other sites. If update conflicts are not first detected and then handled in some convergent manner, the replicated copies will begin to diverge. FIG. 3 illustrates a typical update conflict scenario. Site A 310 and site B 320 are shown having copy 312 and copy 322, respectively, of a replicated table called "emp." In this example, the "emp" table is a body of data that stores information about employees and is organized into rows and columns. The columns of table "emp" record attributes about each employee, such as an employee number ("empno"), the name of the employee ("ename"), a commission figure ("comm"), and an accrued bonus level ("bonus"). The rows of table "emp" refer to individual employees; for example, employee number 100 is named Jones, has a commission figure of $20, and has accrued a bonus of level zero.
At site A 310, an update request 314 is processed to increase Jones' commission by $75, resulting in local table 316. Concurrently, however, another update request 324 is processed at site B 320 to increase Jones' commission by $280, resulting in local table 326. In FIG. 3, the particular update requests 314 and 324 are each illustrated by a SQL (Structured Query Language) statement. Modification information 318, comprising old and new values from site A 310, is propagated to site B 320 via replication mechanism 330. In this example, the old values for employee number 100 are the name "Jones," a commission of $20, and a bonus level of zero, and the new values for employee number 100 are the name "Jones," a commission of $95, and a bonus level of zero. A conflict detection mechanism 332 at site B 320 receives the old and new values and checks for a conflict by comparing the propagated old values with the current values for the row 328. Since the current value of the commission for employee number 100 is $300 at site B 320, but the old value of the commission for employee number 100 propagated from site A 310 is $20, the conflict detection mechanism 332 is able to detect a conflict. When a conflict is detected, one or more appropriate conflict resolution routines may be applied until the conflict is handled, as described in detail in the commonly assigned U.S. application Ser. No. 08/618,507.
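The comparison performed by the conflict detection mechanism in the FIG. 3 scenario may be illustrated, purely as a minimal sketch, in Python. The `RowChange` record and the `detect_conflict` and `apply_change` functions are hypothetical names introduced for this example; the values are those given above for employee number 100. Conflict resolution is outside the scope of the sketch, which merely declines to apply a conflicting change.

```python
from dataclasses import dataclass


@dataclass
class RowChange:
    """Hypothetical modification information propagated from a source site."""
    key: int
    old: dict  # column values at the source site before the update
    new: dict  # column values at the source site after the update


def detect_conflict(current_row, change):
    """Return the columns whose current destination values differ from
    the propagated old values; an empty list means no conflict."""
    return sorted(
        col for col, old_val in change.old.items()
        if current_row.get(col) != old_val
    )


def apply_change(table, change):
    """Apply the new values only when no conflict is detected."""
    conflicts = detect_conflict(table[change.key], change)
    if not conflicts:
        table[change.key].update(change.new)
    return conflicts


# Site B's copy after its own local update (commission raised from 20 to 300).
site_b_emp = {100: {"ename": "Jones", "comm": 300, "bonus": 0}}

# Modification information from site A (commission raised from 20 to 95).
change_from_a = RowChange(
    key=100,
    old={"ename": "Jones", "comm": 20, "bonus": 0},
    new={"ename": "Jones", "comm": 95, "bonus": 0},
)

conflicts = apply_change(site_b_emp, change_from_a)
# conflicts == ["comm"]: the current value 300 differs from the old value 20,
# so site B's row is left unchanged pending conflict resolution.
```

Note that the sketch transmits the old and new values of every column of the row, including the unchanged "ename" and "bonus" columns.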
Therefore, conflicts can be detected by comparing a current value for a column or attribute at a destination site with a propagated old value from another site. Thus, the conflict detection mechanisms at destination sites need to know the new value that was propagated from the source site, the old value at the source site, and the current value at the destination site. One disadvantage of conventional approaches to asynchronous replication with conflict detection is the excessive amount of data that is transmitted from one site to another site in the network. Propagating the old and new values imposes an overhead in transmitting changes over the network and in temporary storage, especially if the data size of the old and new values is large.