1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to replicating updates in original temporal order in parallel processing database systems.
2. Description of Related Art
Relational DataBase Management Systems (RDBMS) store data in tables. A table in a relational database is two-dimensional, comprising rows and columns. Each column has a name, typically describing the type of data held in that column. As new data is added, more rows are inserted into the table. Structured Query Language (SQL) statements allow users to formulate relational operations on the tables.
In the Teradata® RDBMS sold by NCR Corporation, the assignee of the present invention, tables in the relational database are often partitioned, i.e., the rows for a table are distributed among multiple processors and data storage devices. The partition is usually a horizontal distribution, wherein a table will have all of its rows spread among multiple processors.
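By way of illustration only, a horizontal distribution of rows might be sketched as follows. The hash-modulo placement, the function name `partition_rows`, and the `key_column` parameter are assumptions made for this sketch; the text above does not specify how rows are assigned to processors.

```python
import zlib


def partition_rows(rows, key_column, n_procs):
    """Distribute rows among n_procs partitions (hypothetical scheme).

    Assumption: a row is placed by hashing one column value and taking
    the result modulo the number of processors.
    """
    partitions = [[] for _ in range(n_procs)]
    for row in rows:
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        h = zlib.crc32(str(row[key_column]).encode("utf-8"))
        partitions[h % n_procs].append(row)
    return partitions


rows = [{"id": i, "name": f"row{i}"} for i in range(10)]
parts = partition_rows(rows, "id", 4)
```

Each row lands in exactly one partition, and a given column value always maps to the same partition.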
However, such partitioning creates problems for replicating the table. For a partitioned table to be replicated, one column, or a set of columns, in each row must be designated as a unique primary key. The unique primary key definition for a table must be the same on the primary system and all subscriber systems for that table. A primary system generates the SQL statements to update the tables, wherein the updates are then propagated to one or more subscriber systems.
When a transaction on the primary system updates a table that is designated as a replicated table, the changes need to be sent to the subscriber systems. These updates may comprise inserted, changed, and deleted rows. The problem that needs to be solved is that the updates need to be applied on the subscriber systems in the correct sequence.
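The importance of the sequence can be seen in a small example. The following sketch is illustrative only: the in-memory dictionary standing in for a table, the `(operation, key, value)` tuple layout, and the function name `apply_changes` are assumptions made for this example and are not part of any described system.

```python
def apply_changes(changes):
    """Apply (op, key, value) changes in the given order to an empty table."""
    table = {}
    for op, key, value in changes:
        if op == "insert":
            table[key] = value
        elif op == "update":
            # An update to a row that does not exist has no effect here.
            if key in table:
                table[key] = value
        elif op == "delete":
            table.pop(key, None)
    return table


# Order in which the changes were made on the primary system.
original = [
    ("insert", 1, "a"),
    ("update", 1, "b"),
    ("delete", 1, None),
    ("insert", 1, "c"),
]
# The same four changes, arriving at a subscriber in a different order.
reordered = [original[0], original[3], original[1], original[2]]

in_order = apply_changes(original)       # row 1 ends with value "c"
out_of_order = apply_changes(reordered)  # row 1 ends up deleted
```

The same set of changes leaves row 1 present in one case and absent in the other, so a subscriber that applies them out of sequence diverges from the primary.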
At first, this might seem like a simple problem, but it can be rather complicated in a parallel processing environment. While there have been various techniques developed for replicating databases, there is a need in the art for improved techniques that replicate databases in a parallel processing environment.
The present invention discloses a method, apparatus, and article of manufacture for replicating modifications made to a subject table from a primary system to a subscriber system. The subject table is partitioned across a plurality of processors in both the primary and subscriber systems, wherein each of the processors manages at least one partition of the subject table. A change row message is generated for each modification made to the subject table, wherein the change row message identifies the processor in the primary system making the modification, and includes a sequence number for the processor in the primary system. The processor in the primary system identified in the change row message is re-mapped to a new processor in the subscriber system, and the sequence number for the processor in the primary system identified in the change row message is re-mapped to a new sequence number for the new processor in the subscriber system, so that the modifications are applied in a correct order on the subscriber system.
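The re-mapping summarized above might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `ChangeRow` fields, the hash-based choice of subscriber processor, and the per-processor counters are hypothetical. The sketch also assumes that all changes to a given primary key originate from the same primary processor, so that ordering by source processor and source sequence number preserves the order of changes to each row.

```python
import zlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRow:
    src_proc: int  # processor on the primary system that made the change
    src_seq: int   # sequence number assigned by that primary processor
    key: str       # unique primary key of the modified row
    op: str        # "insert", "update", or "delete"


def subscriber_proc(key, n_procs):
    """Hypothetical placement rule: hash the primary key (assumption)."""
    return zlib.crc32(key.encode("utf-8")) % n_procs


def remap(messages, n_subscriber_procs):
    """Re-map each message to a subscriber processor and a new sequence number.

    Returns a list of (dst_proc, dst_seq, message) tuples.
    """
    next_seq = defaultdict(int)  # one counter per subscriber processor
    remapped = []
    # Walk each primary processor's messages in its own sequence order.
    for msg in sorted(messages, key=lambda m: (m.src_proc, m.src_seq)):
        dst = subscriber_proc(msg.key, n_subscriber_procs)
        next_seq[dst] += 1
        remapped.append((dst, next_seq[dst], msg))
    return remapped


# Messages may arrive out of order; the source sequence numbers recover it.
msgs = [
    ChangeRow(0, 2, "a", "update"),
    ChangeRow(1, 1, "b", "insert"),
    ChangeRow(0, 1, "a", "insert"),
    ChangeRow(0, 3, "a", "delete"),
]
out = remap(msgs, 3)
```

In this sketch, each subscriber processor can apply its changes in ascending order of the new sequence number, and the changes to any one row are then applied in the order in which they were made on the primary system.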
An object of the present invention is to optimize database access on parallel processing computer systems. Another object of the present invention is to improve the performance of database partitions managed by a parallel processing computer system.