1. Field
The disclosed technology is in the field of relational database replication and, more particularly, in the field of scalable relational database replication.
2. Background
Aspects of a typical database replication system include the following:
Ensuring that all transactions written to the primary database are replicated in full to a secondary database.
Committing (or finalizing) transactions in the secondary database in exactly the same order as the commit action of each transaction in the primary database.
Recovery in the event of a failure, by switching the role of a database to either primary or secondary (as determined by the failure situation).
Several forms of database replication exist; the three most common methods are described here. FIG. 1A shows an asynchronous log replication system that utilizes a database transaction log to perform replication asynchronously to secondary databases. In such a system, a primary database (103) and its transaction log (104) reside on a primary server (102). Transactions are written by a client program (101) to the primary database (103) and its transaction log (104) in a first operation. A separate process monitors the transaction log (104) and replicates transactions to a secondary database (106) residing on a secondary server (105) in a second operation. This method is the most commonly employed, but it suffers from significant replication lag under high transaction rates and cannot guarantee that all transactions are written to the secondary database (106). If, for example, a backlog of thousands of transactions is stored in the transaction log (104) awaiting replication and the primary database (103) experiences a failure, some or all of the pending transactions are lost. Thus, when the system fails over to a secondary database, making it the new primary database, that database holds an incomplete copy of the original primary database, a condition that is intolerable for most commercial database applications.
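The data-loss condition described for FIG. 1A can be illustrated with a minimal sketch. The class and method names, the transaction count, and the replication batch size below are all hypothetical choices made for illustration; the sketch is a toy in-memory model, not an implementation of any particular database product.

```python
from collections import deque


class AsyncLogReplication:
    """Toy model of FIG. 1A: primary database, transaction log, secondary database."""

    def __init__(self):
        self.primary = []      # transactions committed on the primary database (103)
        self.log = deque()     # transaction log (104): entries awaiting replication
        self.secondary = []    # transactions applied on the secondary database (106)

    def write(self, txn):
        # First operation: the client writes to the primary database and its log.
        self.primary.append(txn)
        self.log.append(txn)

    def replicate(self, max_txns):
        # Second operation: a separate process drains the log in commit order.
        for _ in range(min(max_txns, len(self.log))):
            self.secondary.append(self.log.popleft())

    def fail_primary(self):
        # The primary server is lost together with its unreplicated log entries.
        lost = list(self.log)
        self.log.clear()
        return lost


system = AsyncLogReplication()
for i in range(1000):               # hypothetical workload of 1000 transactions
    system.write(f"txn-{i}")
system.replicate(max_txns=400)      # the replicator lags behind the writer
lost = system.fail_primary()
print(len(system.secondary), len(lost))  # prints "400 600"
```

In this sketch the secondary database holds only the first 400 transactions at the moment of failover, which corresponds to the incomplete copy described above.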
FIG. 1B shows a two-phase commit synchronous replication system in which any single transaction must be written and committed successfully to both the primary and secondary databases before the transaction is considered successful. In this process, a client (107) sends all database write statements (shown as a first operation) to a primary database (109), residing on a primary server (108), and to a secondary database (111), residing on a secondary server (110). The client (107) then sends a commit request to a transaction coordinator (112) (shown as a second operation). The transaction coordinator (112) then sends a prepare message (shown as a third operation) to the primary database (109) and the secondary database (111), each of which must acknowledge in turn. Then the transaction coordinator (112) sends a commit message (shown as a fourth operation) to the primary database (109) and the secondary database (111), each of which must acknowledge in turn. The transaction coordinator (112) then sends an acknowledgement of the final commit to the client (107) (shown as a fifth operation). This mechanism guarantees in most cases that the transaction either fails or succeeds on both the primary and secondary databases; however, it introduces significant overhead. The transaction coordinator (112) must wait for all databases participating in the transaction to complete all processing and respond. Further, the use of a centralized transaction coordinator (112) creates a bottleneck that limits scalability and further slows performance as client processes are added to the system. Lastly, this mechanism requires that at least two databases be available at all times for a transaction to be performed reliably. If one of the participating databases fails, it must be rebuilt on a standby server, which can incur a significant time delay (minutes, hours or longer).
Alternatively, there can be at least three running servers, each participating in a two-phase commit transaction, but this adds further system overhead, processing delays and significant cost to the system.
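The prepare and commit exchanges of the two-phase commit sequence of FIG. 1B can be sketched as follows. This is a simplified illustration under stated assumptions: the Participant class, the two_phase_commit function, the example SQL statement and the message-counting convention (one request plus one acknowledgement per exchange) are hypothetical, and the sketch omits durability, timeouts and coordinator recovery.

```python
class Participant:
    """Hypothetical stand-in for a database taking part in the transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.pending = None
        self.committed = []

    def write(self, statements):
        # First operation: the client sends its write statements.
        self.pending = list(statements)

    def prepare(self):
        # Vote yes only if there is pending work and it can be committed.
        return self.can_prepare and self.pending is not None

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None


def two_phase_commit(participants):
    """Return (outcome, messages); each exchange counts a request and an acknowledgement."""
    messages = 0
    votes = []
    for p in participants:          # third operation: prepare, acknowledged in turn
        votes.append(p.prepare())
        messages += 2
    if all(votes):
        for p in participants:      # fourth operation: commit, acknowledged in turn
            p.commit()
            messages += 2
        return "committed", messages
    for p in participants:          # any "no" vote aborts the transaction everywhere
        p.rollback()
        messages += 2
    return "aborted", messages


primary, secondary = Participant("primary"), Participant("secondary")
for db in (primary, secondary):
    db.write(["UPDATE accounts SET balance = balance - 10 WHERE id = 1"])
outcome, messages = two_phase_commit([primary, secondary])
print(outcome, messages)  # prints "committed 8"
```

Under this counting convention, the coordinator handles eight messages for two participants and twelve for three, and every exchange blocks until the slowest participant responds, which is consistent with the overhead and bottleneck described above.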
FIG. 1C shows a middle-tier replication system that normally requires at least one middle-tier server and process (114) located between a client (113) and a primary database (116) and a secondary database (118). The middle-tier server (114) receives all transactions from the client (113) and relays them to the participating databases in a synchronous manner. This approach adds both overhead and cost to the system, because the middle tier must be redundant (requiring at least two middle-tier servers) and acts as a bottleneck that cannot scale effectively as additional clients (113) are added to the system.
There are several drawbacks in these replication systems. Synchronous replication mechanisms cause a significant degradation in overall database performance by enforcing the completion of an entire transaction on both the primary and secondary databases. These systems often require a centralized transaction coordinator, which must ensure that transactions complete fully across all participating databases. This limits scalability by creating a bottleneck at the centralized component, and it also requires additional complexity to protect that single component against failure, as the system cannot operate without it.
Asynchronous replication systems avoid the performance penalty, but they do not guarantee the success of transactions on the secondary databases, and they can experience major delays (referred to as “replication lag”) between the time a transaction is performed on the primary database and the time it is written to the secondary database. Further, replicated transactions must be committed in the exact order in which they were performed on the source database. This requires writing one transaction at a time on the replicated database, which slows performance and increases replication lag.
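The growth of replication lag under serial, in-order application can be estimated with simple arithmetic. The sketch below is illustrative only: the function name and the throughput figures (5000 transactions per second on the primary, 2000 per second applied serially on the secondary, sustained for 60 seconds) are hypothetical numbers chosen for the example, not measurements.

```python
def replication_backlog(primary_tps, secondary_apply_tps, seconds):
    """Transactions still awaiting application on the secondary after `seconds`."""
    return max(0, (primary_tps - secondary_apply_tps) * seconds)


# Hypothetical rates: the primary commits concurrently, the secondary applies one at a time.
backlog = replication_backlog(primary_tps=5000, secondary_apply_tps=2000, seconds=60)
lag_seconds = backlog / 2000    # time the secondary needs to drain the backlog
print(backlog, lag_seconds)     # prints "180000 90.0"
```

Whenever the primary sustains a higher commit rate than the secondary can apply serially, the backlog in this model grows linearly with time, and so does the number of transactions exposed to loss on failure.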
If a failure occurs in either a synchronous or an asynchronous replication system, the database system ideally should continue to operate in a reliable fashion. Some systems support continued operation after a failure only on the single remaining operating member of the system (such as the primary or secondary database that is still operational). In this case, there is no protection against a subsequent failure of the remaining components until the original failed component has been recovered. Alternatively, some systems require at least two active secondary databases in the system, which is costly.
These conditions limit the effectiveness and usefulness of database replication systems, particularly when high-performance transactions and high availability are required. Each of the prior methods possesses one or more drawbacks for high-performance systems that must process a large volume of transactions and must be reliable and scalable without reliance on centralized components.