The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A multi-node database management system (“DBMS”) is made up of interconnected nodes that share access to shared data resources. Typically, the nodes are interconnected via a network and share access, in varying degrees, to shared storage, e.g. shared access to a set of disk drives and data blocks stored thereon. The nodes in a multi-node database system may be in the form of a group of computers (e.g. work stations, personal computers) that are interconnected via a network. Alternately, the nodes may be the nodes of a grid. A grid is composed of nodes in the form of server blades interconnected with other server blades on a rack.
Each node in a multi-node database system hosts a database server. A server, such as a database server, is a combination of integrated software components and an allocation of computational resources, such as memory, a node, and processes on the node for executing the integrated software components on a processor, the combination of the software and computational resources being dedicated to performing a particular function on behalf of one or more clients. Among other functions of database management, a database server governs and facilitates access to a particular database, processing requests by clients to access the database.
Resources from multiple nodes in a multi-node database system can be allocated to running a particular database server's software. Each combination of the software and allocation of the resources from a node is a server that is referred to herein as a “server instance” or “instance”.
Transaction Processing
Like any multi-node computer, one or more of the nodes may fail. When a node fails, one or more of the surviving nodes performs failure recovery. In the database systems, this entails redoing or undoing certain changes to the database system. A redo log is scanned to determine which changes need to be redone or undone and how to redo or undo the changes.
A redo log contains redo records. Redo records record changes to a unit of data in a database (e.g. a row, a data block that stores rows) A redo record contains enough information to reproduce a change between a version of the unit of data previous to a change and a version of the unit of data subsequent to the change.
During failure recovery, much of the database is locked. Normal access to the database by the surviving nodes is prevented until it can be determined which units of data have changes that may need to redone or undone. Once this set is determined, the database is unlocked and normal access to unit of data that are not in the set is permitted.
Because the database is locked until the completion of the process of determining the set of units of data that have changes that need to be redone or undone, the completion of this process delays the full availability of the database system. Constant and complete availability of a database system is a critical feature of a multi-node database system. Therefore, there is a need to reduce the time is takes to determine the set of units of data that have changes to be redone or undone.