1. Field of the Invention
This invention relates generally to the redo logs used in database systems to log the transactions performed by the database system and more specifically to the replication of procedure calls using the redo logs at a standby database.
2. Description of related art
Nowadays, most businesses, governments, and large organizations are completely dependent on their database systems. If the database system fails, the organization cannot operate. Because organizations depend so completely on their database systems, the database systems must be reliable. One way in which reliability is achieved in database systems is careful design to reduce hardware and software failures; another is redundancy of hardware and data so that hardware and software failures do not result in loss of data or of service; still another is recoverability, so that when a failure does occur, the database system can be restarted without loss of data. A technique that is commonly used to achieve recoverability is logging; whenever the database system performs a transaction, it logs the results of the operations making up the transaction in a file. The result of the logging operation is a transaction log that records operations belonging to a stream of transactions performed by the database system. When a failure occurs, the transactions in the stream that were performed up to the point of the failure can be recovered by redoing the operations specified in the log file. For this reason, such transaction logs are often termed redo logs.
While redo logs were originally developed to permit recovery from failures, both the designers of the database systems and their users soon realized that the information contained in the logs could be put to other uses. There are two broad categories of such uses: data mining and replication. Data mining takes advantage of the fact that a redo log necessarily contains a complete record over the period during which the redo log was made of the operations performed by the database system on the data stored in the database system. One use of such information is to tune the database system for more efficient performance; another is to analyze the kinds of transactions being made by users of the database system over a particular period. For example, if the database system keeps track of the sales of items of merchandise, the redo log could be examined to see whether a TV promotion of a particular item of merchandize had any immediate effect on sales of the item.
Replication is an extension of the original purpose of the redo log. When a redo log is used for recovery, what is actually done is that the database system is put into the condition it was in at the point at which the redo log begins and the operations that are recorded in the redo log are replicated in the database system. In the same manner, the redo log can be used to propagate changes to other database systems. For example, if an organization has a main personnel database system at headquarters and local personnel database systems at various branches, the redo log from the main database system can be used to replicate the operations performed at the main database system in each of the branch database systems, so that what is in the local database systems continues to correspond to what is in the headquarters personnel database system.
Originally, the information in the redo logs was copied from the database system at an extremely low level. For example, in relational database systems, the data in the database systems is organized into tables. Each table has a name by which it is known in the database system. Each table further has one or more named columns. When the table contains data, the table has one or more rows, each of which contains fields corresponding to each of the columns. The fields contain data values. The data base system's tables are in turn defined in other tables that belong to the database system's data dictionary. To perform an operation in a database system, one specifies the operation in terms of table names and column names. The actual data specified in the tables is, however, contained in data blocks in the database system, and whenever data was changed in the database system, a copy of the changed data block was written to the redo log.
Someone needing to mine logs could begin with a copy of data from a redo log and use information from the data dictionary to determine what table the changed data belonged to and from the kind of change what kind of database operation had been performed, but doing so was time consuming and mistake prone. Also, the fact that the changes were recorded at the data block level meant that the redo log could be used for replication only in database systems that were substantially identical to the one in which the redo log had been made.
To make redo logs easier to use for data mining and replication, database system designers developed software to mine the redo logs to produce a sequence of equivalent logical operations in memory. This process not only indicates what change had been made, but also described the operation in terms of a query language command and the names of the tables and columns affected by the operation. This makes redo logs much easier to analyze, and as long as a particular database system can perform the logical operations, a replica of a set of changes can be made in the particular database system.
FIG. 1 illustrates a system that uses the redo log for replication. The system includes a primary database 101 and a standby database 106. The primary database 101 is a production database that functions in the primary role. This is the database that is accessed by most applications that use the system. The standby database 106 is a transactionally consistent copy of the primary database 101. The primary database 101 and the standby database 106 can be at any locations, as long as they can communicate with each other. Stream of transactions performed on a primary database 101 are recorded in a redo log 102. The data in the redo log 102 are transmitted in a redo stream 103 to the standby database 106. A LogMiner utility 104 at the standby database 106 processes the redo log 102 and produces a sequence of equivalent logical operations 105 in memory. The logical operations 105 are then applied to the standby database 106.
While the use of logical operations have made the replication of a database much easier, problems still remain. First, many administrative functions performed at the primary database 101 are typically accomplished through the use of procedures. A procedure is a schema object that includes SQL statements and constructs that execute as a unit to solve a specific problem or perform a set of related tasks. Procedures permit the caller to provide arguments that can be input only, output only, or input and output values. Procedures are not restricted to administrative purposes but may also include business procedures and consequently be an integral part of business applications. Changes caused by a procedure typically span over multiple transactions. These transactions should either be applied or not applied at the standby database 106 as a whole. However, only changes to individual data blocks are recorded in the redo log 102. The standby database 106 would have no knowledge that a set of changes are associated with the same procedure. Thus, if there was a failure before the procedure completed at the primary database 101, there is no reliable way to rollback from the failed procedure as a whole. Second, the changes caused by the procedure and recorded in the redo log 102 may refer to local metadata, which may not be identical at the standby database 106. So the re-creation of the individual logical operations 105 generated from the redo log 102 may not lead to the same result at the standby database 106.
It is an object of the techniques disclosed herein to solve these and other problems of redo logs and replication of procedures at standby databases. These techniques are equally applicable to both administrative and business procedures.