In a typical relational database management system (RDBMS), all modifications to the database are logged in a redo stream (made up of redo records) to provide recovery and transaction durability. This redo stream (or redo log) can be used to drive asynchronous applications providing a variety of functionality. For example, the redo stream can be used to provide Logical Standby, in which a standby database shadows a primary database by extracting committed transactions out of the redo stream and applying those transactions. As another example, the redo stream can be used to provide Log-based replication, in which a replica site extracts committed changes made to the tables of interest in the database and applies the changes in order to keep the replica tables synchronized. As yet another example, the redo stream can be used to provide user query functionality, in which the redo stream is queried as though it were a relational table.
In one conventional application, the redo stream is analyzed to derive the equivalent data manipulation language (DML) statements that produced the redo stream. DML statements belonging to the same transaction are grouped together and committed transactions are provided to the application. Redo records typically only identify the modified schema objects or the associated columns with numbers generated internally to the database management system (DBMS). In order to perform log analysis and subsequent application of transactions, a data dictionary is needed to provide the mapping from the numbers to user-defined names. For example, SQL statements use column names and table names.
The organization of schema objects is not static. For example, columns may be dropped from or added to a table. Each new organization of a schema object defines a new version of the object. Since asynchronous log based applications may process a given portion of the redo stream multiple times and the organization of a schema object may change in the portion of the redo stream that must be reprocessed, the data dictionary required to do log analysis must represent multiple versions of the schema objects. Conventional log analysis application could only process a given portion of the redo stream one time or would allow multiple passes over a given portion of the redo stream either by requesting that the data dictionary be completely reloaded before each pass (very expensive in terms of computing) or by accepting results that were missing some symbolic information.
A need arises for a technique by which the data dictionary can represent multiple versions of the schema objects that provides improved performance, reduced computing costs, and more accurate results.