1.0 Field of the Invention
This invention relates to a database management system; and in particular, this invention relates to support for schema evolution in multi-node peer-to-peer replication environment.
2.0 Description of the Related Art
Database management systems allow large volumes of data to be stored and accessed efficiently and conveniently in a computer system. In a database management system, data is stored in database tables which organize the data into rows and columns. FIG. 1 depicts an exemplary database table 24 which has rows 26 and columns 28. To more quickly access the data in a database table, an index may be generated based on one or more specified columns of the database table. In relational database management systems, specified columns are used to associate tables with each other.
The database management system responds to user commands to store and access the data. The commands are typically Structured Query Language (SQL) statements such as SELECT, INSERT, UPDATE and DELETE, to select, insert, update and delete, respectively, the data in the rows and columns. The SQL statements typically conform to a SQL standard as published by the American National Standards Institute (ANSI) or the International Standards Organization (ISO).
Departments within an enterprise may have their own database management systems, typically at different sites. An enterprise typically wants to share data throughout the enterprise. A technique called replication is used to share data among multiple database management systems.
A replication system manages multiple copies of data at one or more sites, which allows the data to be shared among multiple database management systems. Data may be replicated synchronously or asynchronously. In synchronous data replication, a two-phase commit technique is used. In a two-phase commit, a transaction is applied only if all interconnected distributed sites agree to accept the transaction. Typically all hardware components and networks in the replication system must be available at all times in for synchronous replication.
Asynchronous data replication allows data to be replicated, at least on a limited basis, and thus allows for system and network failures. In one type of asynchronous replication system, referred to as primary-target, all database changes originate at the primary database and are replicated to the target databases. In another type of replication system, referred to as update-anywhere, updates to each database are applied at all other databases of the replication system.
An insert, update or delete to the tables of a database is a transactional event. A transaction comprises one or more transactional events that are treated as a unit. A commit is another type of transactional event which indicates the end of a transaction and causes the database to be changed in accordance with any inserts, updates or deletes associated with the transaction.
In some database management systems, a log writer updates a log as transactional events occur. Each transactional event is associated with an entry in the log.
When a replication system is used, a user typically specifies the types of transactional events which cause data to be replicated. In addition, the user typically specifies the data which will be replicated, such as certain columns or an entire row. In some embodiments, the log writer of the database management system marks certain transactional events for replication in accordance with the specified types of transactional events. The replication system reads the log, retrieves the marked transactional events, and transmits the transactional events to one or more specified target servers. The target server applies the transactional events to the replication table(s) on the target server.
A schema is a data structure that describes the columns of a table. A schema specifies the number of columns in a table and the type of data to be stored in the columns of the table. For example, a schema may specify that column one contains integer data, column two contains character data and column three contains floating point data. The schema is changed in order to change the number of columns and/or type of data stored in the columns of the table.
If the schema for a table needs to be altered or if the fragmentation strategy needs to be changed then replication is typically stopped, the table is altered, and replication is restarted. However, stopping replication may cause the client applications to shut down. Some database management systems operate in a non-stop environment and stopping replication may cause client applications to stop. In addition, when a schema of a table is altered, one participant of a replicate may be using the new schema which has a new format for a row of a replicated table, while other participants of the replicate may be using an older format of the row. Because the row formats are different, replication may stop. Therefore there is a need for technique to allow the schema of a replicated table to be changed without causing the client applications to stop.