A database is made up of one or more database objects. Database objects are logical data structures that are used by a database server to store and organize both data in the database and procedures that operate on the data in the database. For example, in a relational database system, a table is a database object with data arranged in rows, each row having one or more columns representing different attributes or fields. Another database object in the relational database is a database view of certain rows and columns of one or more database tables. Another database object in the relational database is an index of values in a key column in a database table that points to the rows in the table having a particular value in the key column. Another database object in the relational database is a database trigger, a procedure that is executed in response to a particular event, such as the insertion of a row into a database table.
Data manipulation operations include adding a row, deleting a row, and modifying contents of a row, among others. Database definition operations include adding a table, adding a column to a table, and adding an index to a table, among others. Another database object in the relational database is a package of procedures that may be invoked and executed by the database server. A data dictionary is a table that stores information about the columns in each table in the database and the users who may access each table.
An object-relational database allows users to define classes and to generate, store and retrieve, in a relational database, multiple user-defined objects based on each class. An object is said to be an instance of the class that defines it. A class has a name that uniquely identifies the class, one or more elements (“attributes”) to hold values that describe an object of the class, and one or more methods to perform on the values in the attributes. Each attribute has a type.
Database systems include built-in types. Built-in types vary with different database systems, but typically include a logical type (holding one of two values corresponding to “true” and “false”), several integer types (each having one of several pre-defined lengths), several floating decimal point number types (each having one of several lengths), a character type, a string type (made up of one or more characters), and a date (including numbers representing year, month, day, hour, minute, second).
A class and the objects instantiated from the class have a type that is made up of the combination of the types of the attributes in the class. A user-defined object has a user-defined object type that is not a built-in type. For example, a user-defined class named “Street_Address” has a “Street_Number” attribute that is a short integer built-in type, a “Street_Name” attribute that is a character string built-in type, and an “Apt_Number” attribute that is another string built-in type. The user-defined object type is named for the class; the user-defined object type is a “street_address” type that is a combination of built-in types short integer, string, string. The street_address type is not a built-in type.
One user-defined class may have another user-defined class as an attribute. For example, a user-defined class named “Address” has a Street_Address attribute that is a street_address type, a “City_Name” attribute that is a string built-in type, a “State_Name” attribute that is another string built-in type, a “Zip_Code” attribute that is a short integer built-in type, and a “Country” attribute that is another string built-in type. The object type “address” is a combination of types street_address, string, string, short integer, string. Expanding the street_address type into the built-in types of its elements, the object type “address” is a combination of built-in types short integer, string, string, string, string, short integer, string. The address type is not a built-in type.
The object-relational database allows user-defined objects to be stored in a database object, such as a table. For example, the object-relational database allows an employee table to store Street_Address objects of street_address type. In a table, a column type indicates the type of data stored in the column. Thus, the employee table includes a particular column having a column type of the street_address type for storing objects of the street_address type.
Data in a database is often shared among many users for multiple applications. For example, data in an employee database of a multinational corporation is shared among corporate officials and personnel for accounting, payroll and human resources departments, each running a different application program that uses data in the database. The applications send queries to a common database server. Based on the queries, the database server retrieves data from the database or changes the database—such as by adding, deleting or modifying the data in the database objects, or by adding, deleting or modifying the structure of the database objects themselves.
In many circumstances, it is advantageous to copy some or all of the database objects constituting the database to multiple sites on a network. Replication is the process of copying and maintaining database objects in multiple databases that make up a distributed database system. Changes applied at one site are captured and stored locally before being forwarded and applied at each of the other sites. Alternatively, changes applied at one site cal also be synchronously forwarded and applied at each of the other sites. The process for propagating to other sites the changes that are made to replicated database objects, and then making corresponding changes to the copy of the objects at the other sites, is called convergence or synchronization. Replication provides a user at any site with fast, local access to shared data. Replication also enhances availability of the database and the applications that employ the database because, if one site goes down, the database at another site can be accessed for data retrieval and for updating.
When replicating a database onto a new node that does not have a copy of the database, a problem arises when the database contains a user-defined type for user-defined objects. Conventional procedures invoked for replicating database objects onto the new node do not allow for user-defined types within a database object.
Conventional replication of a database object on a new node typically involves (1) obtaining the data dictionary for the database object, (2) creating the data structures of the database object on the new node based on the data dictionary, and then (3) moving data into the data structures on the new node. When the data dictionary indicates a database object includes a user-defined type for a user-defined object, the conventional routines are unable to proceed because the user-defined type is not yet defined on the new node. The data received at the new master site is used by a database server on the new master site to create the database objects of the replication group at the new master site. According to the conventional replication, there is no data in the data structure on the new master site that defines the user-defined type. The database server on the new master site uses the data structure for user-defined types on the new master site to determine the meaning of user-defined types. Therefore, the database server on the new master site is unable to interpret the name for the user-defined type included in the data received for the data dictionary. The database server on the new master site cannot create the database object nor fill it with data. Replication of the database object fails. Thus, replication fails for a database if one database object in the database includes a user-defined type.
Even if the replication did not fail because of the new master site's inability to interpret the name for the user-defined type, other problems still may arise. For example, code already written for marshalling contents from columns with built-in types into a series of bytes for transfer to the new master site may be extended to marshal contents from columns with user-defined types. Code written for un-marshalling the series of bytes received at the new master site would also be extended to un-marshal the bytes received into contents for columns with the user-defined types. The user-defined types may be arbitrarily complex, with user-defined types based on other user-defined types, with repeated columns with the same user-defined types, with references and with backpointers to cyclic structures. The code to extend the marshalling and un-marshalling can become quite complex. Executing such complex code can significantly slow the process of replicating database objects that include user-defined types. Furthermore, producing such complex code is expensive and prone to error.
Another problem is that database commands for inserting and updating rows in a table with user-defined columns can become complex. For example, consider a Structured Query Language (SQL) statement that inserts into a table named “Employee” a value in a column named “Home” of user-defined type “Address” which includes an embedded user-defined type of “street_address.” If the data to be inserted into the row does not include a street_address, then an SQL insert statement may take the form of                INSERT INTO EMPLOYEE (Home) VALUES (Address(null, ‘San Francisco’, ‘CA’, ‘94000’)).        
This SQL statement causes the database system to invoke a type constructor process for the user-defined type “Address” and passes the constructor values for the four attributes of the Address type (Street_Address, City_Name, State_Name, and Zip_Code). Because no value is inserted for the Street_Address, a type constructor for the user-defined type “street_address” is not needed. However, if the data to be inserted included a street address, then the SQL statement would take the form of                INSERT INTO EMPLOYEE (Home) VALUES (Address(Street_Address(123, ‘Main Street’, null), ‘San Francisco’, ‘CA’, ‘94000’)).        
This SQL statement causes the database system to also invoke a type constructor process for the user-defined type “Street_Address” and passes the constructor values for the three attributes of the street_address type (Street_Number, Street_Name, and Apt_Number).
For rows with many user-defined types, each with embedded user-defined types that may include further embedded user-defined types down several layers, the SQL statement can become quite complex as a result of a combinatorial explosion. When the change caused by this insert is replicated to the new master site, additional complex code at the new master site is invoked to generate and execute the complex SQL statement to reproduce the insert at the new master site. Generating such long SQL statements and executing them can significantly slow the process of replicating changes to database objects that include user-defined types. Furthermore, producing such complex code is expensive and prone to error. Among other functions, the complex code must distinguish between a null for a built-in type and an atomic null for a user-defined type. Software code that already performs replication for database objects with built-in types provides a copy of the database objects with attributes that have the same names on the new master site as on the original master site. In some circumstance it is desirable that the attributes have different names on the new master site. For example, an application developed for users at the new master site may use a column called “EmpID” of type string for an employee identification number. At the old master site the same information may be in a column called “empno” of type string. Rather than change the application used at the new master site, it is advantageous to replicate the data in column “empno” at the old master site into the column “EmpID” at the new master site.
Based on the foregoing, there is a clear need for techniques that provide the benefits of peer-to-peer replication for object-relational databases.
Furthermore, there is a clear need for techniques that replicate database objects with user-defined objects while avoiding the costs of producing complex code to marshal and un-marshal arbitrarily complex user-defined types and avoiding the low performance caused by executing such code.
In addition, there is a clear need for techniques that replicate database objects with user-defined objects while avoiding the combinatorial explosion of SQL statements.
There is also a clear need for techniques that replicate database objects with different attribute names on the new master site than at the old master site.