1.0 Field of the Invention
This invention relates to a database management system; and in particular, this invention relates to online repair of a replicated table.
2.0 Description of the Related Art
Database management systems allow large volumes of data to be stored and accessed efficiently and conveniently in a computer system. In a database management system, data is stored in database tables which organize the data into rows and columns. FIG. 1 depicts an exemplary database table 24 which has rows 26 and columns 28. To more quickly access the data in a database table, an index may be generated based on one or more specified columns of the database table. In relational database management systems, specified columns are used to associate tables with each other.
The database management system responds to user commands to store and access data. The commands are typically Structured Query Language (SQL) statements such as SELECT, INSERT, UPDATE and DELETE, to select, insert, update and delete, respectively, the data in the rows and columns. The SQL statements typically conform to a SQL standard as published by the American National Standards Institute (ANSI) or the International Standards Organization (ISO).
Departments within an enterprise may have their own database management systems, typically at different sites. An enterprise typically wants to share data among the departments throughout the enterprise. A technique called replication is used to share data among multiple database management systems.
A replication system manages multiple copies of data at one or more sites, which allows the data to be shared among multiple database management systems. Data may be replicated synchronously or asynchronously. In synchronous data replication, a two-phase commit technique is used. In a two-phase commit, a transaction is applied only if all interconnected distributed sites agree to accept the transaction. Typically all hardware components and networks in the replication system must be available at all times in synchronous replication.
Asynchronous data replication allows data to be replicated on a limited basis, and thus allows for system and network failures. In one type of asynchronous replication system, referred to as primary-target, all database changes originate at the primary database and are replicated to the target databases. In another type of replication system, referred to as update-anywhere, updates to each database are applied at all other databases of the replication system.
An insert, update or delete to the tables of a database is a transactional event. A transaction comprises one or more transactional events that are treated as a unit. A commit is another type of transactional event which indicates the end of a transaction and causes the database to be changed in accordance with any inserts, updates or deletes associated with the transaction.
In some database management systems, a log writer updates a log as transactional events occur. Each transactional event is associated with an entry or record in the log; and each entry in the log is associated with a value representing its log position.
When a replication system is used, a user typically specifies the types of transactional events which cause data to be replicated. In addition, the user typically specifies the data which will be replicated, such as certain columns or an entire row. In some embodiments, the log writer of the database management system marks certain transactional events for replication in accordance with the specified types of transactional events. The replication system reads the log, retrieves the marked transactional events, and transmits the transactional events to one or more specified target servers. The target server applies the transactional events to the replicated table(s) on the target server.
A replicated table may need to be repaired under certain circumstances. For example, a replicated table may need to be repaired if it was taken out of replication for some duration of time, if some of the rows of that table failed to be replicated due to errors, or if the table was newly added into the replication topology and a user wants to bring the table up-to-date. In some replication systems, replication is stopped to repair a table.
Various database management systems operate in a non-stop environment in which the client applications using the database management system cannot be shut down. However, stopping replication may stop the database management system and therefore the client applications. Thus, there is a need for a technique to repair a replicated table without causing downtime to the client applications in the replication environment. The technique should repair a replicated table without requiring replication to be stopped.