Information can be important and valuable, and procedures are undertaken to protect it. In a procedure referred to as data replication, which is used by modern enterprises, data that is primarily updated and/or accessed at one storage system, referred to herein as a “primary data system” (sometimes called a source data system), is replicated or duplicated at another storage system or location, referred to herein as a “replica data system.” The data stored at the primary data system is referred to herein as primary data or a primary copy, and the data stored at the replica data system is referred to as replica data or a replica copy.
Database systems (DBMSs) are often protected using replication. Typically, one DBMS maintains the primary copy of a database and another database system, referred to herein as a standby database system, maintains a replica of the primary copy. The standby database system is used to back up (or mirror) information stored in the primary database system or other primary copy.
For a DBMS protected using replication, data files, redo log files and control files are stored in separate, logically or physically identical images on separate physical media. In the event of a failure of the primary database system, the information is preserved, in duplicate, on the standby database system, which can be used in place of the primary database system.
The standby database system is kept up to date so that it accurately and promptly reproduces the information in the primary database system. Typically, archived redo log records (“redo records”) are transmitted automatically from the primary database system to the standby database system. Information from the redo logs is used to replicate changes on the primary database system to the standby database system.
There are two types of standby database systems, physical standby database systems and logical standby database systems, which differ in the way they archive information. In a physical standby database system, changes are made using physical replication. Under physical replication, updates made to a unit of contiguous storage (herein “data unit”) at the primary data system are made to the corresponding data unit replicas stored at the replica data system. In the context of database systems, changes made to data blocks on the primary database system are replicated in replicas of those data blocks on the physical standby database system.
A data block is an atomic unit of persistent contiguous storage used by a DBMS to store database records (e.g., rows of a table). Information stored on the primary database system is thus replicated at the lowest atomic level of database storage space, and a physical standby database system is essentially a physical replica of the primary database system. When a record is read from persistent storage, the data block containing the record is copied into a buffer of the DBMS's buffering system. The buffer usually contains many other rows, along with control and formatting information (e.g., offsets to sequences of bytes representing rows or other data structures, and lists of transactions affecting rows).
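As an illustration of a data block that holds rows together with formatting information such as row offsets, the following is a minimal sketch only. The layout (a row count followed by an offset/length directory), the 64-byte `BLOCK_SIZE`, and the names `build_block` and `read_row` are assumptions made for the example and do not describe the block format of any particular DBMS.

```python
import struct

BLOCK_SIZE = 64  # hypothetical block size, kept tiny for readability


def build_block(rows):
    """Pack rows into one fixed-size block.

    Assumed layout: a uint16 row count, then one (offset, length)
    pair of uint16 values per row, then the row bytes themselves.
    """
    header_size = 2 + 4 * len(rows)
    directory = []
    body = b""
    for row in rows:
        # Each directory entry records where the row starts and how long it is.
        directory.append((header_size + len(body), len(row)))
        body += row
    header = struct.pack("<H", len(rows))
    header += b"".join(struct.pack("<HH", off, length) for off, length in directory)
    block = header + body
    if len(block) > BLOCK_SIZE:
        raise ValueError("rows do not fit in one block")
    return block.ljust(BLOCK_SIZE, b"\x00")


def read_row(block, index):
    """Locate a row through the offset directory and return its bytes."""
    (count,) = struct.unpack_from("<H", block, 0)
    if not 0 <= index < count:
        raise IndexError("no such row in this block")
    offset, length = struct.unpack_from("<HH", block, 2 + 4 * index)
    return block[offset:offset + length]


block = build_block([b"alice,30", b"bob,41"])
print(read_row(block, 1))
```

Reading a single row still requires the whole block to be in the buffer, which is why the buffer holds other rows and the directory alongside the row of interest.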
To replicate changes from the primary database system, the standby database system scans the redo records generated for the primary database system. A redo record records a change to a data block between a previous version of the data block and a subsequent version of the data block, and contains enough information to reproduce the change on a copy of the previous version. Using the information in a redo record to reproduce the recorded change on a copy of the previous version of the data block, thereby producing the subsequent version of the data block, is an operation referred to herein as applying the redo record.
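Applying a redo record can be sketched as follows. This is a simplified illustration, not an actual redo format: it assumes each record names a block, a byte offset within that block, and the replacement bytes. The names `RedoRecord`, `apply_redo`, and `apply_redo_stream` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo record: which block changed, where, and to what."""
    block_id: int
    offset: int
    new_bytes: bytes


def apply_redo(block: bytes, record: RedoRecord) -> bytes:
    """Produce the subsequent version of a block from its previous version."""
    end = record.offset + len(record.new_bytes)
    if end > len(block):
        raise ValueError("redo record extends past the end of the block")
    return block[:record.offset] + record.new_bytes + block[end:]


def apply_redo_stream(blocks: Dict[int, bytes],
                      records: Iterable[RedoRecord]) -> Dict[int, bytes]:
    """Apply records in order to a copy of the blocks (the standby's replicas)."""
    standby = dict(blocks)
    for record in records:
        standby[record.block_id] = apply_redo(standby[record.block_id], record)
    return standby


previous = {0: bytes(16), 1: bytes(16)}
redo = [
    RedoRecord(block_id=0, offset=2, new_bytes=b"ab"),
    RedoRecord(block_id=1, offset=14, new_bytes=b"zz"),
    RedoRecord(block_id=0, offset=3, new_bytes=b"X"),
]
subsequent = apply_redo_stream(previous, redo)
```

The records must be applied in the order in which they were generated, because a later record (the third one here) may overwrite bytes written by an earlier one.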
Another approach to replicating data is that of the logical standby database system. With the logical standby database system approach, DBMS commands that modify data on the primary system are in effect re-executed on a logical standby database system to essentially duplicate the changes made to the primary database. While executing the same DBMS commands guarantees that changes are replicated at the transactional level, the changes are not replicated at the data block level.
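The logical approach can be illustrated with a small sketch in which the same commands are executed on two independent databases. SQLite, accessed through Python's standard `sqlite3` module, is used here purely as a stand-in for a DBMS; the `accounts` table and the `replay_commands` helper are assumptions made for the example.

```python
import sqlite3


def replay_commands(conn, commands):
    """Execute each (sql, params) pair in order, then commit."""
    for sql, params in commands:
        conn.execute(sql, params)
    conn.commit()


def rows(conn):
    """Return the logical contents of the hypothetical accounts table."""
    return conn.execute(
        "SELECT id, balance FROM accounts ORDER BY id"
    ).fetchall()


commands = [
    ("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)", ()),
    ("INSERT INTO accounts (id, balance) VALUES (?, ?)", (1, 100)),
    ("INSERT INTO accounts (id, balance) VALUES (?, ?)", (2, 250)),
    ("UPDATE accounts SET balance = balance - ? WHERE id = ?", (40, 1)),
]

primary = sqlite3.connect(":memory:")
standby = sqlite3.connect(":memory:")

# The primary executes the commands; the standby re-executes the same ones.
replay_commands(primary, commands)
replay_commands(standby, commands)

print(rows(primary) == rows(standby))
```

The two databases end up with the same rows, but nothing in this scheme forces their underlying storage to be byte-for-byte identical, which is the distinction from physical replication.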
For various reasons, this difference in replication strategy allows a logical standby database system to remain available to reporting applications while replication is being performed. The ability to support reporting with data known to be fresh, without lagging in updates, is beneficial because it puts the redundant hardware to use rather than having it serve solely as a backup system.
Physical standby database systems typically have comparatively high performance relative to logical standby database systems. However, a physical standby database system is unavailable for read operations unless application of redo records from the primary database system is stopped. Delays in updating the physical standby database system, such as those caused by stopping the application of redo records while the standby database is open for reads, may cause it to lag further behind the primary database system. This lag exposes fresh data in the primary database system to the risk of corruption or loss without adequate archived backup information available. Such delays also introduce the possibility of read operations on data that is increasingly at variance with the primary database system.
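The trade-off between read availability and lag can be made concrete with a toy model. The sketch below assumes that each redo record carries an increasing sequence number and that the standby applies nothing while it is open for reads; the class `PhysicalStandby` and its methods are hypothetical and model only this behavior.

```python
from collections import deque


class PhysicalStandby:
    """Toy model: redo is queued, and applied only while reads are closed."""

    def __init__(self):
        self.pending = deque()   # sequence numbers received but not yet applied
        self.applied_seq = 0     # sequence number of the last applied record
        self.open_for_reads = False

    def receive(self, seq):
        """Accept a redo record; apply it at once unless reads are open."""
        self.pending.append(seq)
        if not self.open_for_reads:
            self.apply_pending()

    def apply_pending(self):
        while self.pending:
            self.applied_seq = self.pending.popleft()

    def open_reads(self):
        self.open_for_reads = True

    def close_reads(self):
        """Stop serving reads and catch up on the queued redo."""
        self.open_for_reads = False
        self.apply_pending()

    def lag(self, primary_seq):
        """Number of primary changes not yet reflected on the standby."""
        return primary_seq - self.applied_seq


standby = PhysicalStandby()
for seq in (1, 2, 3):
    standby.receive(seq)
lag_before = standby.lag(3)

standby.open_reads()
for seq in (4, 5):
    standby.receive(seq)
lag_while_open = standby.lag(5)

standby.close_reads()
lag_after = standby.lag(5)
```

In this model the lag grows with every change the primary makes while the standby is open for reads, and readers during that window see data as of sequence number 3 rather than 5.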
Thus, there is a need to allow read operations on physical standby database systems in a way that eliminates or minimizes delays in updating the physical standby database system.
Physical replication in other contexts has similar limitations and needs. For example, in remote mirroring, data on a primary disk volume or file system is replicated on replicas of the primary disk volume or file system. The data units physically replicated are sectors, which are the atomic units of contiguous storage read from and written to disk. The data units of a primary disk volume or file system are replicated on the replicas of the disk volume or file system. The changes to data units are recorded by change descriptions. Information about changes in a change description is used to reproduce the changes on copies of the sectors.
Therefore, there is a need for any system using physical replication to allow read operations to replicas to be performed in a way that eliminates or minimizes delays to updating the replica with changes to primary data.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.