Under certain conditions, it is desirable to store copies of a particular set of data, such as a relational table, at multiple sites. If users are allowed to update the set of data at one site, the updates must be propagated to the copies at the other sites in order for the copies to remain consistent. The process of propagating the changes is generally referred to as replication.
The site at which a change is initially made to a set of replicated data is referred to herein as the source site. The sites to which the change must be propagated are referred to herein as destination sites. If a user is allowed to make changes to copies of a particular table that are at different sites, those sites are source sites with respect to the changes initially made to their copy of the table, and destination sites with respect to the changes initially made to copies of the table at other sites.
Two types of replication systems are in use today: synchronous replication systems and asynchronous replication systems. In synchronous replication systems, no change by a transaction is considered permanent until all changes made by the transaction are successfully applied at the source site and at all of the relevant destination sites. A technique known as two-phase commit may be used to ensure the proper operation of synchronous replication. Two-phase commit is generally described in "Notes on Data Base Operating Systems", Gray, J. N., IBM Res. Rep. RJ2188 (Feb. 1978), "Operating Systems: An Advanced Course", R. Bayer, R. M. Graham, and G. Seegmuller, Eds., Springer-Verlag, Berlin and New York, 1979, pp. 393-481, and U.S. Pat. No. 5,452,445 entitled "Two-Pass Multi-Version Read Consistency", filed on Apr. 3, 1992 and issued to Hallmark et al. on Sep. 19, 1995, the contents of which are incorporated herein by reference.
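By way of illustration only, the all-or-nothing behavior of synchronous replication under two-phase commit may be sketched as follows. The sketch is not taken from the cited references; the `Site` class, its `prepare`, `commit`, and `rollback` methods, and the `synchronous_update` function are hypothetical names, and each site is assumed to be a simple in-memory key/value copy of a table.

```python
class Site:
    """Hypothetical in-memory copy of one table (a dict of key -> value)."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available  # assumption: models a reachable site
        self.data = {}              # permanent (committed) contents
        self.pending = None         # changes staged during phase one

    def prepare(self, changes):
        # Phase one: the site votes on whether it can apply the changes.
        if not self.available:
            return False
        self.pending = dict(changes)
        return True

    def commit(self):
        # Phase two (all sites voted yes): make staged changes permanent.
        self.data.update(self.pending)
        self.pending = None

    def rollback(self):
        # Phase two (some site voted no): discard anything staged.
        self.pending = None


def synchronous_update(source, destinations, changes):
    """Apply changes at every site, or at none of them."""
    sites = [source] + list(destinations)
    if all(site.prepare(changes) for site in sites):
        for site in sites:
            site.commit()
        return True
    for site in sites:
        site.rollback()
    return False
```

In this sketch a single unavailable destination site causes the update to be rejected at the source site as well, which is the availability cost of synchronous replication.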
Asynchronous replication systems separate the task of making the changes at the source site from the task of making changes at the destination sites. Changes made by a transaction are made permanent at a source site without respect to whether the changes have been made permanent at any of the relevant destination sites. Typically, records of the changes are simply stored in a queue at the source site, to be propagated to and applied at the destination sites at a later time. One mechanism for performing asynchronous replication is described in U.S. patent application Ser. No. 08/126,586 entitled "Method and Apparatus for Data Replication", filed on Sep. 24, 1993 by Sandeep Jain and Dean Daniels (hereinafter "Jain"), the contents of which are incorporated herein by reference.
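The queue-based deferral described above may be sketched, under simplifying assumptions, as follows. This is not the mechanism of Jain; the `AsyncSource` class and its `update` and `propagate` methods are hypothetical, destination sites are assumed to be plain dictionaries, and a single shared queue is assumed rather than one queue per destination site.

```python
from collections import deque


class AsyncSource:
    """Hypothetical source site that defers propagation of its changes."""

    def __init__(self):
        self.data = {}        # changes are permanent here immediately
        self.queue = deque()  # records of changes awaiting propagation

    def update(self, key, value):
        # The change is made permanent at the source site without
        # contacting any destination site.
        self.data[key] = value
        self.queue.append((key, value))

    def propagate(self, destinations):
        # At a later time, apply the queued changes, in order, at every
        # destination site (each assumed to be a plain dict).
        while self.queue:
            key, value = self.queue.popleft()
            for dest in destinations:
                dest[key] = value
```

Between a call to `update` and the next call to `propagate`, the destination copies lag behind the source copy, which is the consistency cost of asynchronous replication.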
Relative to asynchronous replication systems, synchronous replication systems have the advantage that all replicated copies are always up-to-date and identical. In contrast, replicated copies at some sites in asynchronous systems may contain data that has already been superseded at other sites. Further, in asynchronous systems it is possible for events at destination sites to make it impossible for certain changes to be replicated at the destination sites. Because the changes have already been made permanent at the source site, asynchronous systems must provide some mechanism for conflict detection and resolution, such as that described in U.S. patent application Ser. No. 08/618,507 entitled "Configurable Conflict Resolution in a Computer Implemented Distributed Database", filed by Souder et al. on Mar. 19, 1996 (hereinafter "Souder"), the contents of which are incorporated herein by reference.
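One simple way to detect such a conflict, offered here only as an illustrative assumption and not as the mechanism of Souder, is for each propagated change record to carry the value the source site saw before the change, so that the destination site can compare that value with its own current value. The `apply_change` function and its `resolve` parameter below are hypothetical.

```python
def apply_change(dest, key, old_value, new_value, resolve=None):
    """Apply one propagated change at a destination site (a plain dict).

    old_value is the value the source site held before its change.
    resolve is an optional, caller-supplied function that picks a
    winner from (current destination value, incoming new value).
    """
    current = dest.get(key)
    if current == old_value:
        # No intervening change at the destination: apply directly.
        dest[key] = new_value
        return "applied"
    if resolve is None:
        # Conflict detected and no resolution routine configured.
        return "conflict"
    dest[key] = resolve(current, new_value)
    return "resolved"
```

For example, passing `resolve=max` would keep the larger of the two competing values; the choice of resolution routine is left to the user in this sketch.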
Asynchronous systems have an advantage over synchronous replication systems in that asynchronous systems allow data to be committed at the source site sooner than in synchronous systems. Specifically, each operation that changes replicated data in asynchronous systems does not have to be preceded by the handshaking operations between the source and destination sites that are required in synchronous systems. Further, changes may be made permanent at a source site in an asynchronous system even though one or more of the destination sites is not currently available. This is particularly important when one or more sites in the replication system will be disconnected from the system on a recurring basis, such as when the host for one of the sites is a portable computer.
Synchronous systems and asynchronous systems represent two extremes in the trade-off between consistency and availability. Synchronous systems enforce absolute consistency, but cannot operate well in networks where sites may not always be available. Asynchronous systems continue working when sites become disconnected and reconnected. However, this availability is gained at the expense of data consistency between the sites.
Based on the foregoing, it is desirable to provide a mechanism which allows users to select a replication configuration that embodies a balance between availability and consistency according to the needs of their specific system. It is desirable to provide a system that allows users to rely on the consistency of synchronous replication where consistency is required, and enjoy the availability of asynchronous replication where such availability is required.