In a typical relational database management system (RDBMS), all modifications to the database are logged in a redo stream (made up of redo records) to provide recovery and transaction durability. This redo stream (or redo log) can also be used to drive asynchronous applications that provide a variety of functionality. For example, the redo stream can be used to provide Logical Standby, in which a standby database shadows a primary database by extracting committed transactions out of the redo stream and applying those transactions. As another example, the redo stream can be used to provide log-based replication, in which a replica site extracts committed changes made to the tables of interest in the database and applies the changes in order to keep the replica tables synchronized. As yet another example, the redo stream can be used to provide user query functionality, in which the redo stream is queried as though it were a relational table. In addition, the logical redo stream is platform independent and may be interpreted on any computer platform.
In one conventional application, the redo stream is analyzed to derive the equivalent data manipulation language (DML) statements that produced the redo stream. DML statements belonging to the same transaction are grouped together, and committed transactions are provided to the application. Redo records typically identify the modified schema objects and their associated columns only by numbers generated internally to the database management system (DBMS). SQL statements, by contrast, refer to tables and columns by user-defined names. In order to perform log analysis and subsequent application of transactions, a data dictionary is therefore needed to provide the mapping from the internal numbers to the user-defined names.
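The log analysis described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout (RedoRecord), the object number 5012, the table and column names, and the restriction to INSERT and COMMIT records are all hypothetical and do not reflect the redo format of any particular DBMS.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo record: objects and columns appear only as numbers."""
    txn_id: int
    op: str              # "INSERT" or "COMMIT" in this sketch
    obj_no: int = 0      # internal object number of the modified table
    values: tuple = ()   # (column number, value) pairs

# Hypothetical data dictionary:
# object number -> (table name, {column number: column name})
DICTIONARY = {
    5012: ("EMPLOYEES", {1: "ID", 2: "NAME"}),
}

def to_sql(rec, dictionary):
    """Derive an equivalent INSERT statement by resolving numbers to names."""
    table, cols = dictionary[rec.obj_no]
    names = ", ".join(cols[col_no] for col_no, _ in rec.values)
    # repr() stands in for real SQL literal quoting; illustration only.
    vals = ", ".join(repr(value) for _, value in rec.values)
    return f"INSERT INTO {table} ({names}) VALUES ({vals})"

def committed_transactions(stream, dictionary):
    """Group derived DML by transaction and yield each transaction at COMMIT."""
    pending = {}
    for rec in stream:
        if rec.op == "COMMIT":
            yield rec.txn_id, pending.pop(rec.txn_id, [])
        elif rec.op == "INSERT":
            pending.setdefault(rec.txn_id, []).append(to_sql(rec, dictionary))

stream = [
    RedoRecord(7, "INSERT", 5012, ((1, 1), (2, "Ann"))),
    RedoRecord(8, "INSERT", 5012, ((1, 2), (2, "Bob"))),  # never committed
    RedoRecord(7, "COMMIT"),
]
for txn_id, statements in committed_transactions(stream, DICTIONARY):
    print(txn_id, statements)
```

In this sketch, transaction 8 is never delivered because no commit record for it appears in the stream, which mirrors the rule that only committed transactions are provided to the application.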
The organization of schema objects is not static. For example, columns may be dropped from or added to a table. Each new organization of a schema object defines a new version of the object. Asynchronous log-based applications may process a given portion of the redo stream multiple times, and the organization of a schema object may change within the portion of the redo stream that must be reprocessed. The data dictionary required for log analysis must therefore represent multiple versions of the schema objects. Conventional log analysis applications either could process a given portion of the redo stream only once, or allowed multiple passes over it at a cost: the data dictionary had to be completely reloaded before each pass, which is very expensive in terms of computing, or the results were missing some symbolic information.
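One way to picture a data dictionary that represents multiple versions of a schema object is sketched below. It is a sketch under stated assumptions, not the technique of any particular system: the class name VersionedDictionary, the idea of tagging each version with the redo stream position at which it takes effect, and the sample object number and column names are all hypothetical.

```python
import bisect

class VersionedDictionary:
    """Hypothetical dictionary keeping every version of each schema object.

    Each version is tagged with the redo stream position at which it takes
    effect, so any portion of the stream can be reprocessed without a reload.
    """

    def __init__(self):
        self._starts = {}    # obj_no -> sorted list of starting positions
        self._versions = {}  # obj_no -> versions, parallel to _starts

    def add_version(self, obj_no, start_pos, table, columns):
        """Record a new organization of the object, effective at start_pos."""
        starts = self._starts.setdefault(obj_no, [])
        versions = self._versions.setdefault(obj_no, [])
        i = bisect.bisect_left(starts, start_pos)
        starts.insert(i, start_pos)
        versions.insert(i, (table, dict(columns)))

    def lookup(self, obj_no, pos):
        """Return the (table, columns) version in effect at stream position pos."""
        starts = self._starts.get(obj_no, [])
        i = bisect.bisect_right(starts, pos) - 1
        if i < 0:
            raise KeyError(f"no version of object {obj_no} at position {pos}")
        return self._versions[obj_no][i]

d = VersionedDictionary()
d.add_version(5012, 100, "EMPLOYEES", {1: "ID", 2: "NAME"})
# A column is added at position 200, which defines a new version of the table.
d.add_version(5012, 200, "EMPLOYEES", {1: "ID", 2: "NAME", 3: "EMAIL"})

print(d.lookup(5012, 250))  # version with the EMAIL column
print(d.lookup(5012, 150))  # a second pass over older redo sees the old version
```

Because older versions are retained rather than overwritten, a second pass over an earlier portion of the stream resolves names against the organization that was in effect when that redo was generated.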
In a relatively limited database system, the applications that process the redo stream are implemented in the same system as the database that generates the redo stream. However, in a more flexible distributed database system, the applications that process the redo stream are implemented in database systems that are remote from the database that generates the redo stream. In such a distributed system, the redo stream from the database must be transmitted to one or more distributed database systems, upon which the applications that process the redo stream are implemented. In addition, in order to process the redo stream, the applications need access to a data dictionary that represents multiple versions of the schema objects. This requires that the data dictionaries in the distributed database systems be maintained by replicating the multiple versions of the schema objects to them.
A need arises for a technique by which the redo stream from the database may be transmitted to one or more distributed database systems, upon which the applications that process the redo stream are implemented, and by which the data dictionaries may be maintained by replicating the multiple versions of the schema objects to data dictionaries in the distributed database systems.