A database is made up of one or more database objects. Database objects are logical data structures that are used by a database server to store and organize both data in the database and procedures that operate on the data in the database. For example, in a relational database, a table is a database object with data arranged in rows, each row having one or more columns representing different attributes or fields. Another database object in the relational database is a database view of certain rows and columns of one or more database tables. Another database object in the relational database is an index. An index typically stores values from a key column in a database table, and points to the rows in the table that have a particular value in the key column.
Another database object in the relational database is a database trigger. A database trigger is a procedure that is executed upon an operation involving a database table, such as a data manipulation operation or a data definition operation. Data manipulation operations include adding a row, deleting a row, and modifying contents of a row, among others. Data definition operations include adding a table, adding a column to a table, and adding an index to a table, among others. Another database object in the relational database is a package of procedures that may be invoked and executed by the database server.
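The database objects described above can be illustrated with a minimal sketch. It assumes SQLite, accessed through Python's standard sqlite3 module, as a stand-in for the relational database; the table, view, index, and trigger names are hypothetical. SQLite has no packages and no triggers on data definition operations, so only a table, a view, an index, and a trigger on a data manipulation operation are shown.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table: rows of employees, each with columns for different attributes.
cur.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "dept TEXT NOT NULL, salary REAL NOT NULL)"
)

# View: certain rows and columns of the employee table.
cur.execute(
    "CREATE VIEW payroll_staff AS "
    "SELECT id, name FROM employee WHERE dept = 'payroll'"
)

# Index: values of the key column 'dept', pointing to the matching rows.
cur.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

# Trigger: a procedure executed when a row's salary is modified.
cur.execute(
    "CREATE TABLE salary_audit ("
    "employee_id INTEGER, old_salary REAL, new_salary REAL)"
)
cur.execute(
    """
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON employee
    BEGIN
        INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
    END
    """
)

# Data manipulation operations: add rows, then modify one row.
cur.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Ada", "payroll", 100.0), ("Grace", "accounting", 120.0)],
)
cur.execute("UPDATE employee SET salary = 110.0 WHERE name = 'Ada'")
conn.commit()

view_rows = cur.execute("SELECT id, name FROM payroll_staff").fetchall()
audit_rows = cur.execute("SELECT * FROM salary_audit").fetchall()
print(view_rows)   # rows visible through the view
print(audit_rows)  # rows written by the trigger
```

In this sketch the view exposes only the payroll row, and the trigger records the salary change without any explicit call from the application.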
Data in a database is often shared among many users for multiple applications. For example, data in an employee database of a multinational corporation is shared among corporate officials and personnel in the accounting, payroll and human resources departments, each running a different application program that uses data in the database. The applications send queries to a common database server. Based on the queries, the database server retrieves data from the database or changes the database, such as by adding, deleting or modifying the data in the database objects, or by adding, deleting or modifying the structure of the database objects themselves.
In many circumstances, it is advantageous to copy some or all of the database objects constituting the database to multiple sites on a network. Replication is the process of copying and maintaining database objects in multiple databases that make up a distributed database system. Changes applied at one site are captured and stored locally before being forwarded and applied at each of the other, remote sites. The application of the changes made at each site to each other site is a process called convergence or synchronization.
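The capture, store, forward, and apply cycle described above can be sketched as follows. This is an illustrative model only, not the mechanism of any particular database server: the Site and Change classes, the "upsert" and "delete" operation names, and the push method are all hypothetical, and conflict detection and resolution between sites are deliberately omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """One captured change (hypothetical representation)."""
    op: str            # "upsert" or "delete"
    key: str
    value: object = None


@dataclass
class Site:
    """One site holding a replica of the replicated objects."""
    name: str
    rows: dict = field(default_factory=dict)
    # Changes captured and stored locally, not yet forwarded.
    pending: list = field(default_factory=list)

    def _apply(self, change):
        if change.op == "upsert":
            self.rows[change.key] = change.value
        elif change.op == "delete":
            self.rows.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")

    def local_change(self, change):
        """Apply a change at this site and capture it for later forwarding."""
        self._apply(change)
        self.pending.append(change)

    def push(self, peers):
        """Forward stored changes to each of the other, remote sites."""
        for change in self.pending:
            for peer in peers:
                if peer is not self:
                    peer._apply(change)
        self.pending.clear()


a, b, c = Site("A"), Site("B"), Site("C")
sites = [a, b, c]

a.local_change(Change("upsert", "emp:1", "Ada"))
b.local_change(Change("upsert", "emp:2", "Grace"))

# Synchronization: every site forwards its stored changes to the others.
for site in sites:
    site.push(sites)

converged = a.rows == b.rows == c.rows
print(converged)
```

Because the two changes in this example touch different keys, the order in which the sites push does not matter; a real system would also have to handle conflicting changes to the same row.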
Replication provides a user at any site fast, local access to shared data. Replication also enhances availability of the database and the applications that employ the database because, if one site goes down, the database at another site can be accessed for data retrieval and for updating.
A group of database objects replicated together is called a replication group. Often a replication group is created for a subset of the database objects in one or more databases used to support a particular database application. One architecture for distributed databases involves multiple master sites, called peers, each of which contains the same database objects in a master replication group, also called simply a master group. The database servers at each master site automatically work to propagate changes for all database objects in the master group to all the peers, in order to ensure transaction consistency and data integrity.
A problem noted with current distributed databases is that, after a set of master sites has been established, it is difficult to add another master site. The particular network node that is to be used as the new master site is incapable of processing the changes to the database objects being propagated by the extant master sites until after the database objects in the master group have been instantiated (i.e., created) on the particular node. Even then, the particular node cannot process the changes as a normal master site would until all the data that were in the database objects before those changes have been loaded into the newly instantiated database objects on the particular node.
Consequently, when adding a new master site, replication of the master group of the distributed database is suspended (i.e., goes into a quiescent mode in which replication does not occur). Suspending replication activity for a master group is called quiescing the master group. Changes already made at any master node are propagated to the other master nodes before quiescing the master group. During a quiescent period, while replication is suspended, transactions that change the contents or structure of the database objects would lead to inconsistencies among the master nodes. Therefore, a system administrator makes the master group unavailable to a user before quiescing the master group. A user is not allowed to request any services from the database for the master group at any master site during the quiescent period. The quiescent period lasts until the new master site has all the database objects of the master group instantiated and loaded with data so that the master group on the new site is in the same state that the master groups on the other master sites were in at the start of the quiescent period. This quiescent period may last hours and even days for large databases.
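The conventional sequence described above can be sketched as follows, to make the cost of the quiescent period concrete. The MasterGroup class and its method names are hypothetical, and the sketch models each master site as a simple dictionary rather than a real database.

```python
class MasterGroup:
    """Toy model of a master group replicated across master sites."""

    def __init__(self, site_names):
        self.sites = {name: {} for name in site_names}
        self.pending = {name: [] for name in site_names}
        self.available = True   # whether users may request services
        self.quiesced = False   # whether replication is suspended

    def write(self, site, key, value):
        """A user change at one master site."""
        if not self.available:
            raise RuntimeError("master group is unavailable to users")
        self.sites[site][key] = value
        self.pending[site].append((key, value))

    def propagate(self):
        """Forward stored changes from each site to all other sites."""
        if self.quiesced:
            raise RuntimeError("replication is suspended")
        for origin, changes in self.pending.items():
            for key, value in changes:
                for name, rows in self.sites.items():
                    if name != origin:
                        rows[key] = value
            changes.clear()

    def begin_quiesce(self):
        """Block users, flush outstanding changes, then suspend replication."""
        self.available = False
        self.propagate()
        self.quiesced = True

    def finish_adding_master(self, new_site):
        """Instantiate and load the new site, then resume normal operation."""
        source = next(iter(self.sites.values()))
        self.sites[new_site] = dict(source)   # may take hours or days
        self.pending[new_site] = []
        self.quiesced = False
        self.available = True


group = MasterGroup(["A", "B"])
group.write("A", "order:1", "widget")

group.begin_quiesce()
try:
    group.write("B", "order:2", "gadget")   # attempted during quiescent period
    rejected = False
except RuntimeError:
    rejected = True

group.finish_adding_master("C")
group.write("C", "order:2", "gadget")
group.propagate()
print(rejected, group.sites["C"])
```

The order attempted between begin_quiesce and finish_adding_master is rejected at every master site, which is the loss of availability at issue here.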
Making a distributed database unavailable for a quiescent period is a severe problem for commercial applications. The distributed databases most likely to add a master site are those supporting applications with a fast-growing pool of users distributed over a large area, often encompassing many time zones and consequently demanding operations around the clock. Such commercial applications often process orders that involve adding data to the database. The applications would have to suspend operations during the quiescent period each time a new master site is added to meet the growing demands. Each suspension of operations involves many lost orders and consequently significant lost revenue. In addition, there is a chance that a user will be so dissatisfied that the user decides not to return as a customer of the enterprise providing the commercial application. The problem compounds as operations are suspended repeatedly when new master sites are added to accommodate growth.
Based on the foregoing, there is a clear need for a system that adds a new master site for a distributed database, by making a replica of the master group at the new site, without suspending database operations involving the master group at extant master sites.