This invention relates generally to systems that use database replication and more particularly to a system that provides replicated transaction consistency. Replicated transaction consistency means that concurrent transactions accessing the primary and replicated data get the same results that they would get if they were executed serially and without the presence of the replicas.
Replicated database systems include the Sybase Replication Facility, manufactured by Sybase Inc., Emeryville, Calif. and described in the Sybase Replication Server Manual. The replicated system utilizes one or more primary database sites and local replication sites. Selected portions of the primary database are copied to the local sites and used by local transactions running at the local replication sites. Each primary site contains the authoritative copy for a portion of the data. For each data item in a local replica, only one primary site contains the authoritative copy. Transactions running at the replication sites may examine the replica data contained at the replication site itself, examine primary data stored at one or more primary sites, or examine both replica and primary data.
Data items held at a replication site are not locked at the primary site, other than for a short time while data is initially copied from the primary to the replication site. All updates performed by local transactions are transparently relayed to the primary sites. The replication sites receive permanent updates via update notifications from the primary sites.
Data replication facilities are based on subscriptions placed by the replication sites on the primary sites. Each subscription names a primary site table and an optional predicate to select primary site tuples for replication. When a transaction commits at a primary site, updates are sent to each replication site with a matching subscription. The updates sent to each replication site are transmitted in packages. Each package corresponds to a single primary site update transaction. The packages are sent from the primary site to the replication sites in commit time order. For example, if transaction T1 commits before transaction T2, the package for T1 will be sent before the package for T2.
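The subscription and package mechanism described above can be illustrated with a minimal sketch. It is not the Sybase implementation; the names `Subscription`, `Package`, `PrimarySite`, and `outbox`, and the use of an integer commit sequence number to stand in for commit time order, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]
Update = Tuple[str, Row]  # (table name, updated tuple)


@dataclass
class Subscription:
    """Hypothetical subscription: a table name plus an optional predicate."""
    site: str
    table: str
    predicate: Optional[Callable[[Row], bool]] = None  # None selects every tuple

    def matches(self, table: str, row: Row) -> bool:
        return table == self.table and (self.predicate is None or self.predicate(row))


@dataclass
class Package:
    """One package corresponds to exactly one committed primary transaction."""
    txn_id: str
    commit_seq: int  # assumed stand-in for commit time order
    updates: List[Update]


class PrimarySite:
    def __init__(self) -> None:
        self.subscriptions: List[Subscription] = []
        self.outbox: Dict[str, List[Package]] = {}  # per-site queue, in commit order
        self._commit_seq = 0

    def subscribe(self, sub: Subscription) -> None:
        self.subscriptions.append(sub)
        self.outbox.setdefault(sub.site, [])

    def commit(self, txn_id: str, updates: List[Update]) -> None:
        """Queue one package per replication site that has a matching subscription."""
        self._commit_seq += 1
        for site, queue in self.outbox.items():
            selected = [
                (table, row)
                for table, row in updates
                if any(s.site == site and s.matches(table, row) for s in self.subscriptions)
            ]
            if selected:
                # Appending at commit time keeps each queue in commit order.
                queue.append(Package(txn_id, self._commit_seq, selected))
```

In this sketch, if T1 commits before T2, the package for T1 is queued ahead of the package for T2 at every site that receives both. A site whose predicate selects none of a transaction's tuples receives no package for that transaction.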
Replication systems typically provide no guarantee of timeliness for transmitting the packages. In general, the replication system is expected to transmit update packages in a timely manner, but the transmission can be delayed arbitrarily by system load, network congestion, etc. Transactions that run against the replicated data cannot determine when data items were last part of a committed state for the primary database. Therefore, if precautions are not taken, a transaction might act upon versions of data that are no longer current.
Traditional solutions to the replication consistency problem apply to remote buffering of database pages. A combination of locking and update notification keeps remote buffer pools consistent. To conduct a local transaction, global locks are placed on objects that are read during transaction execution. The global locks guarantee proper synchronization with the cache invalidation or cache update protocols that execute when data is updated.
An article entitled "Asynchronous Locking," IBM Technical Disclosure Bulletin, 1985, by Kurt Shoens, discloses a method of arbitrating global locks in a shared-disk cluster. The locks discussed in Shoens are requested and granted asynchronously using Lamport's logical clocks to establish correct ordering of global events. Shoens retains an abbreviated lock history at a master site that is consulted to see if a lock could have been granted at the time it was requested. The lock history consists of an unlock time stamp per lock name plus a global time stamp indicating how far back the lock history extends. Unlike a conventional lock manager, the lock manager in Shoens must be able to tell whether a lock could have been granted at some time in the past.
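The abbreviated lock history can be sketched as follows. This is one possible reading of the description above, not the algorithm as published by Shoens. The class `LockHistory`, its method names, the restriction to exclusive locks, the tracking of currently held locks, and the conservative refusal of requests older than the history are all assumptions. Time stamps are plain integers standing in for Lamport logical clock values.

```python
from typing import Dict, Set


class LockHistory:
    """Hypothetical master-site history: one unlock time stamp per lock name
    plus a global time stamp (horizon) marking how far back the history extends."""

    def __init__(self, horizon: int = 0) -> None:
        self.horizon = horizon
        self.last_unlock: Dict[str, int] = {}  # lock name -> latest unlock time
        self.held: Set[str] = set()            # assumed: exclusive locks only

    def lock(self, name: str) -> None:
        self.held.add(name)

    def unlock(self, name: str, time: int) -> None:
        self.held.discard(name)
        self.last_unlock[name] = max(time, self.last_unlock.get(name, time))

    def truncate(self, new_horizon: int) -> None:
        """Discard unlock records older than the new horizon."""
        self.horizon = max(self.horizon, new_horizon)
        self.last_unlock = {
            n: t for n, t in self.last_unlock.items() if t >= self.horizon
        }

    def could_have_granted(self, name: str, request_time: int) -> bool:
        """Answer whether the lock was free at the logical time of the request."""
        if name in self.held:
            return False
        if request_time < self.horizon:
            return False  # history does not reach back that far; refuse conservatively
        # With no record, the lock has been free since the horizon.
        return request_time > self.last_unlock.get(name, self.horizon - 1)
```

Under these assumptions, a request stamped later than the last recorded unlock of a free lock is granted retroactively, while a request stamped at or before that unlock, or before the horizon, is refused because the history cannot show that the lock was free at that time.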
In a replication environment, it is not practical to use locks on referenced data items. Locks typically apply to the physical structures of a database system. To determine which locks correspond to particular data items, most of the access protocol must be run at the primary server. Controlling all data access from the primary site removes the scaling advantage that replicated database systems offer.
An article entitled "Efficient Optimistic Concurrency Control Using Loosely Synchronized Clocks," Proceedings SIGMOD Conference, 1995, by A. Adya, R. Gruber, B. Liskov, and U. Maheshwari, describes an implementation of optimistic concurrency control using loosely synchronized physical clocks. A distributed transaction wishing to commit is subjected to serial validation at each site that contributed data, regardless of the degree of contribution or whether the site was read-only to the transaction. More site validation is required as the number of replication sites increases. Thus, the system cannot be scaled to operate efficiently with a large number of replication sites.
Thus, a need remains for ensuring transaction consistency for replicated data systems while maintaining effective access and update response at each primary and replication site.