1.0 Field of the Invention
This invention relates to a database management system; and in particular, this invention relates to replicating data in a database management system.
2.0 Description of the Related Art
Database management systems allow large volumes of data to be stored and accessed efficiently and conveniently in a computer system. In a database management system, data is stored in database tables which organize the data into rows and columns. FIG. 1 depicts an exemplary database table 20 which has rows 22 and columns 24. To more quickly access the data in a database table, an index may be generated based on one or more specified columns of the database table. In a relational database management system, specified columns are used to associate tables with each other.
The database management system responds to user commands to store and access the data. The commands are typically Structured Query Language statements such as SELECT, INSERT, UPDATE and DELETE, to select, insert, update and delete, respectively, the data in the rows and columns. The SQL statements typically conform to a SQL standard as published by the American National Standards Institute (ANSI) or the International Standards Organization (ISO).
Departments within an enterprise may have their own database management systems, typically at different sites. An enterprise typically wants to share data throughout the enterprise. A technique called replication is used to share data among multiple database management systems.
A replication system manages multiple copies of data at one or more sites, which allows the data to be shared. Data may be replicated synchronously or asynchronously. In synchronous data replication, a two-phase commit technique is used. In a two-phase commit, a transaction is applied only if all interconnected distributed sites agree to accept the transaction. Typically it is preferable for all hardware components and networks in the replication system to be available at all times in for synchronous replication.
Asynchronous data replication allows data to be replicated, at least on a limited basis, and thus allows for system and network failures. In one type of asynchronous replication system, referred to as primary-target, all database changes originate at the primary database and are replicated to the target databases. In another type of replication system, referred to as update-anywhere, updates to each database are applied at all other databases of the replication system.
An insert, update or delete to the tables of a database is a transactional event. A transaction comprises one or more transactional events that are treated as a unit. A commit is another type of transactional event which indicates the end of a transaction and causes the database to be changed in accordance with any inserts, updates or deletes associated with the transaction.
In some database management systems, a log writer updates a log as transactional events occur. Each transactional event is associated with an entry in the log. Each entry in the log is associated with a value representing a log position.
When a replication system is used, a user typically specifies the types of transactional events which cause data to be replicated. In addition, the user typically specifies the data which will be replicated, such as certain columns or an entire row of a table. In some embodiments, the log writer of the database management system marks certain transactional events for replication in accordance with the specified types of transactional events. The replication system reads the log, retrieves the marked transactional events, and transmits the transactional events to one or more specified target servers. The target server applies the transactional events to the replicated table(s) on the target server.
Typically, there are many events in a database management system that are processed asynchronously which also need to adhere to order. In replication processing, since transactional events are usually posted in parallel by multiple users of the database, it is difficult to maintain the linear order of the posting of the events without processing the events serially.
Therefore, there is a need for a technique to maintain order. In addition, the technique should also maintain order while allowing events to be processed, if possible, in parallel.