Generally described, replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency. Using replication, data may be distributed to different locations and to remote or mobile users over local and wide area networks, dial-up connections, wireless connections and publicly accessible networks of networks, such as the Internet.
Transactional replication can be used to replicate transactional data, such as a database or other form of transactional storage structure. Database replication can be used to describe scenarios in which database management systems attempt to replicate data in order to ensure consistency between redundant resources. Database replication is commonly associated with a master/slave relationship between the original and the copies. In a master/slave relationship, one database may be regarded as the authoritative source of data, and the slave databases are synchronized to it. The master logs the updates, and the updates are then sent to the slaves in order to synchronize them. Each slave outputs a message stating that it has received the update successfully, thus allowing the sending of subsequent updates (and potentially the re-sending of an update until it is successfully applied).
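By way of illustration only, the master/slave exchange described above may be sketched as follows. The class names `Master` and `Slave`, the sequence-number scheme, the `max_attempts` limit and the `drop_next` flag (which simulates a single lost update) are assumptions introduced for this sketch and are not taken from any particular database product.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    data: dict = field(default_factory=dict)
    applied: int = 0          # sequence number of the last applied update
    drop_next: bool = False   # hypothetical hook: simulate one lost update

    def receive(self, seq, key, value):
        """Apply update `seq`; the return value stands in for the acknowledgement."""
        if self.drop_next:
            self.drop_next = False
            return False                 # no acknowledgement, master must re-send
        if seq == self.applied + 1:      # apply only the next update in order
            self.data[key] = value
            self.applied = seq
        return seq <= self.applied       # duplicates are acknowledged, gaps are not


@dataclass
class Master:
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)   # ordered (key, value) updates

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))         # the master logs every update

    def sync(self, slave, max_attempts=3):
        """Send each unacknowledged update in order, re-sending on failure."""
        while slave.applied < len(self.log):
            seq = slave.applied + 1
            key, value = self.log[seq - 1]
            for _ in range(max_attempts):
                if slave.receive(seq, key, value):
                    break                     # acknowledged, move to the next update
            else:
                return False                  # gave up; the slave remains behind
        return True
```

In this sketch a subsequent update is sent only after the previous one has been acknowledged, which mirrors the ordering described above.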
Multi-master replication, where updates can be submitted to any database node and are then sent through to other servers for synchronization, is often desired, but may introduce substantially increased costs and complexity, which may make it impractical in some situations. One common challenge that exists in multi-master replication is transactional conflict prevention or resolution. Most synchronous replication solutions perform conflict prevention. Conflict prevention is typically accomplished by not considering a write operation completed until an acknowledgement is received from both the local and remote databases. Further writes wait until the previous write transaction is completed before proceeding. Most asynchronous solutions perform conflict resolution. Conflict resolution is typically accomplished by considering a write operation completed as soon as the local database acknowledges the write. Remote databases are updated, but not necessarily at the same time. For example, if a record is changed on two nodes simultaneously, a synchronous replication system would detect the conflict before confirming the commit and would abort one of the transactions. An asynchronous replication system would allow both transactions to commit and would run a conflict resolution during resynchronization. The resolution of such a conflict may be based on a timestamp of the transaction, on the hierarchy of the origin nodes or on more complex logic.
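As a minimal sketch of asynchronous conflict resolution during resynchronization, the following example resolves a conflicting record by transaction timestamp and, where timestamps are equal, by the hierarchy of the origin node. The `Version` type, the `node_rank` field (a higher rank is assumed to win ties) and the function names are hypothetical and chosen only for illustration; more complex resolution logic is equally possible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float   # commit time of the local transaction
    node_rank: int     # assumed hierarchy of the origin node; higher wins ties


def resolve(a, b):
    """Pick the later write; fall back to node hierarchy on equal timestamps."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_rank))


def resynchronize(replica_a, replica_b):
    """Merge two replicas that each committed writes independently."""
    merged = {}
    for key in replica_a.keys() | replica_b.keys():
        a = replica_a.get(key)
        b = replica_b.get(key)
        if a is None:
            merged[key] = b          # record exists only on replica B
        elif b is None:
            merged[key] = a          # record exists only on replica A
        else:
            merged[key] = resolve(a, b)
    return merged
```

Both local transactions are treated as committed before `resynchronize` runs; the losing write is simply superseded in the merged state, which is the trade-off of resolving conflicts rather than preventing them.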
Database replication becomes difficult when the number of databases and/or the distance between the databases increases. Typically, a centralized relational database may be used to store data for a variety of services across several hosts. In such a system, a simple request for data would be sent to all the hosts, and each of the hosts would need to access the relational database to obtain the requested data. The plurality of access requests to the centralized relational database may strain the database. One solution has been to use localized caches on the hosts, which typically store local copies of frequently accessed data and thereby reduce the number of access requests to the centralized database. The use of caches may thus allow for some scalability. However, as the data requirements grow and larger caches are needed, there may be issues such as a shortage of random-access memory (RAM). The use of multiple caches may also create coherency issues, and sticky routing may not always be applicable to such systems. When the number of hosts and associated caches is scaled to a large enough number, the centralized relational database may simply get overloaded and become unresponsive.
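The localized caching arrangement described above may be illustrated with a small sketch. `CentralDatabase`, `HostCache` and the least-recently-used eviction policy are assumptions made for this example; the `reads` counter is included only to show how many requests actually reach the centralized database.

```python
from collections import OrderedDict


class CentralDatabase:
    """Stand-in for the centralized relational database."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0                    # number of access requests served

    def get(self, key):
        self.reads += 1
        return self.rows[key]


class HostCache:
    """Bounded local cache on one host; evicts the least recently used entry."""

    def __init__(self, database, capacity):
        self.database = database
        self.capacity = capacity          # limited by the host's available RAM
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)     # mark as recently used
            return self.entries[key]          # served locally, no database access
        value = self.database.get(key)        # cache miss: go to the database
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        return value
```

A cached entry in this sketch is never refreshed, so a later change to the same row in the centralized database is not visible through the cache. This is one simple form of the coherency issue noted above.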
One possible solution to the overloading of a centralized relational database has been to scale with partitions. Caches on the hosts may be partitioned to access multiple relational databases. However, such a solution does not really improve availability, since the partitioned databases are not replicas of one another. Overall, basic caching is not ideal: cache parameters require tuning, partitioning becomes a necessity, the use of more partitions means more failures, and availability is not improved.
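The availability limitation of partitioning may be sketched as follows. The hash-based key placement, the `PartitionedStore` class and the `down` set used to mark a failed partition are all hypothetical and serve only to illustrate that data held by a failed partition has no other copy.

```python
import zlib


def partition_for(key, partition_count):
    """Map a key to a partition index using a stable checksum."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


class PartitionedStore:
    def __init__(self, partition_count):
        self.partitions = [dict() for _ in range(partition_count)]
        self.down = set()                 # indexes of partitions that have failed

    def put(self, key, value):
        index = partition_for(key, len(self.partitions))
        self.partitions[index][key] = value

    def get(self, key):
        index = partition_for(key, len(self.partitions))
        if index in self.down:
            # No other partition holds a copy, so the request cannot be served.
            raise ConnectionError(f"partition {index} is unavailable")
        return self.partitions[index][key]
```

Because each key lives on exactly one partition in this sketch, adding partitions spreads the load but also adds more points of failure, consistent with the observation above.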