The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A clustered database system that runs on multiple computing nodes offers several advantages, such as fault tolerance and/or load balancing, over a database system running on a single computing node. In some example embodiments, a clustered database system includes a plurality of database servers or “instances” that share resources, including a database. FIG. 1 depicts an example clustered database system comprising database instance 100 and database instance 126 that share primary persistent storage 138. Although the example of FIG. 1 depicts two database instances, in some example embodiments, a clustered database system may include more than two database instances.
Each of database instances 100 and 126 may be a collection of memory and processes that interact with data stored on primary persistent storage 138. Database instance 100 and database instance 126 may collectively implement server-side functions of a database management system. To ensure data consistency, each database instance of a clustered database system may acquire mastership of one or more resources. Referring to FIG. 1, set of data 140 and set of data 142 are stored on primary persistent storage 138. For example, database instance 100 may be a master database instance for set of data 140, and database instance 126 may be a master database instance for set of data 142. Modifying particular data involves obtaining permission from the master database instance of the particular data. Thus, modifying set of data 140 involves obtaining permission from database instance 100, and modifying set of data 142 involves obtaining permission from database instance 126.
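The mastership rule described above can be illustrated with a minimal Python sketch. It is not drawn from any particular database product. The names `Instance`, `Cluster`, `grant`, and `modify` are hypothetical, and the single-writer permission policy is an assumption made only to keep the example small.

```python
class Instance:
    """Hypothetical stand-in for a database instance."""

    def __init__(self, name):
        self.name = name
        # resource id -> name of the instance currently permitted to write
        self.write_holder = {}

    def grant(self, requester_name, resource_id):
        """Grant write permission unless another instance already holds it."""
        holder = self.write_holder.get(resource_id)
        if holder is None or holder == requester_name:
            self.write_holder[resource_id] = requester_name
            return True
        return False


class Cluster:
    """Hypothetical cluster that routes modifications through the master."""

    def __init__(self):
        self.master_of = {}  # resource id -> master Instance
        self.storage = {}    # stands in for primary persistent storage

    def acquire_mastership(self, instance, resource_id):
        self.master_of[resource_id] = instance

    def modify(self, requester, resource_id, value):
        # Permission always comes from the master of the resource,
        # regardless of which instance performs the modification.
        master = self.master_of[resource_id]
        if not master.grant(requester.name, resource_id):
            raise PermissionError(f"{master.name} denied {requester.name}")
        self.storage[resource_id] = value


cluster = Cluster()
inst_100 = Instance("instance-100")
inst_126 = Instance("instance-126")
cluster.acquire_mastership(inst_100, "data-140")
cluster.acquire_mastership(inst_126, "data-142")

# instance-126 modifies data-140 only after instance-100 grants permission.
cluster.modify(inst_126, "data-140", "new value")
```

In this sketch, a second writer that asks for the same resource while permission is held would receive a `PermissionError`; an actual system would more likely queue or negotiate the request.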
Primary persistent storage 138 may be one or more systems that store data structures, such as data blocks, in files. For example, primary persistent storage 138 may include a virtual disk and/or one or more physical disks. Data stored on primary persistent storage 138 survives system failure. However, retrieving the data is typically a relatively slow and computationally expensive process.
For efficient data access, a database system typically maintains one or more caches of data in volatile memory, such as main memory or random-access memory. In the example of FIG. 1, database instance 100 includes volatile memory 102, and database instance 126 includes volatile memory 128. Volatile memory 102 and volatile memory 128 may be the same volatile memory of a single computing device or separate volatile memories of separate computing devices.
Referring to FIG. 1, volatile memory 102 includes primary cache 104, and volatile memory 128 includes primary cache 130. Database instance 100 stores set of data 108 in primary cache 104, and database instance 126 stores set of data 134 in primary cache 130. In some example embodiments, each database instance may maintain a respective primary cache of data for which the database instance has become a master database instance. Thus, set of data 140 may be stored as set of data 108 in primary cache 104, and set of data 142 may be stored as set of data 134 in primary cache 130.
Increased efficiency of data access may be achieved by increasing the amount of data that can be cached. However, adding volatile memory to a database system may be cost-prohibitive. Thus, a cost-effective alternative is to supplement volatile memory with relatively inexpensive forms of low-latency non-volatile memory, such as flash memory or another form of solid-state storage, for example a solid-state drive (SSD).
In FIG. 1, secondary persistent storage 112 is an example of non-volatile memory that is used to supplement volatile memories 102, 128. Like primary persistent storage 138, secondary persistent storage 112 is shared by database instances 100 and 126. Secondary persistent storage 112 may be partitioned into a plurality of secondary caches, such as secondary cache 114 and secondary cache 120. Each database instance may maintain a respective secondary cache of data for which the database instance has become a master database instance. Thus, set of data 108 may be stored as set of data 116 in secondary cache 114, and set of data 134 may be stored as set of data 122 in secondary cache 120.
In some example embodiments, a secondary cache may serve as an extension of a primary cache. Typically, lower priority data is moved from a primary cache to a secondary cache. Examples of lower priority data include data that is accessed with a relatively lower frequency, data that is relatively older, and data that is stored at a higher compression level. To track data that has been cached to the secondary cache, header information is stored in volatile memory. The header information is read and consulted to retrieve the data stored in the secondary cache. FIG. 1 depicts set of header data 106 and set of header data 132 as being stored in primary cache 104 and primary cache 130, respectively. However, in some example embodiments, header data may be stored outside of a primary cache in volatile memory.
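The relationship between a primary cache, a secondary cache, and header data can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a definitive implementation. The class `TwoTierCache`, its slot-numbering scheme, and the use of least-recently-used order as the measure of "lower priority" are all hypothetical choices. A plain dictionary stands in for the non-volatile secondary storage.

```python
from collections import OrderedDict


class TwoTierCache:
    """Hypothetical primary cache extended by a secondary cache."""

    def __init__(self, primary_capacity, secondary_storage):
        self.primary_capacity = primary_capacity  # assumed to be at least 1
        self.primary = OrderedDict()          # block id -> data, oldest first
        self.headers = {}                     # block id -> secondary slot (volatile)
        self.secondary = secondary_storage    # slot -> data (non-volatile stand-in)
        self._next_slot = 0

    def put(self, block_id, data):
        # A fresh copy in the primary cache supersedes any secondary copy.
        stale_slot = self.headers.pop(block_id, None)
        if stale_slot is not None:
            del self.secondary[stale_slot]
        self.primary[block_id] = data
        self.primary.move_to_end(block_id)
        # Move the least recently used blocks to the secondary cache and
        # record where each one went in the header data.
        while len(self.primary) > self.primary_capacity:
            old_id, old_data = self.primary.popitem(last=False)
            slot = self._next_slot
            self._next_slot += 1
            self.secondary[slot] = old_data
            self.headers[old_id] = slot

    def get(self, block_id):
        if block_id in self.primary:
            self.primary.move_to_end(block_id)
            return self.primary[block_id]
        # The header data is consulted to locate the block in the secondary cache.
        slot = self.headers.get(block_id)
        if slot is not None:
            return self.secondary[slot]
        return None  # miss in both caches


flash = {}
cache = TwoTierCache(primary_capacity=2, secondary_storage=flash)
cache.put("a", "A")
cache.put("b", "B")
cache.put("c", "C")  # "a" is moved to the secondary cache
```

After the third `put`, block "a" no longer resides in the primary cache, but `cache.get("a")` still returns its data because the header entry records the slot that holds it.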
When a database instance fails, data stored in volatile memory may be lost. This data includes header data. In contrast, data stored in non-volatile memory typically survives any failure. However, the data stored in non-volatile memory is inaccessible without access to corresponding header data.
During instance recovery, the secondary cache is completely repopulated, even though valid data stored there has survived the failure. This is because the header data needed to determine which valid data is in the secondary cache is no longer available. Unfortunately, repopulating a cache involves a significant amount of time, and in the interim, data access may exhibit decreased throughput and increased response times, for example, due to data retrieval from primary persistent storage 138.
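The effect of losing header data can be illustrated with a short Python sketch. The function `read`, the `stats` counters, and the block identifiers are hypothetical. Dictionaries stand in for the secondary cache and primary persistent storage, and an instance failure is modeled simply by discarding the header dictionary.

```python
def read(block_id, headers, secondary, primary_storage, stats):
    """Read a block, preferring the secondary cache when a header exists."""
    slot = headers.get(block_id)
    if slot is not None:
        stats["secondary_hits"] += 1
        return secondary[slot]
    # Without a header entry the cached copy cannot be located, so the
    # read falls back to the slower primary persistent storage.
    stats["primary_reads"] += 1
    return primary_storage[block_id]


primary_storage = {"b1": "data-1", "b2": "data-2"}
secondary = {0: "data-1", 1: "data-2"}   # non-volatile: survives the failure
headers = {"b1": 0, "b2": 1}             # volatile: lost on failure
stats = {"secondary_hits": 0, "primary_reads": 0}

read("b1", headers, secondary, primary_storage, stats)  # served from secondary cache

headers = {}  # simulated instance failure: header data is gone

read("b1", headers, secondary, primary_storage, stats)  # served from primary storage
```

The second read returns the same data, yet it is served from primary persistent storage even though an identical copy still sits in the secondary cache.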
Thus, an approach for quickly recovering data stored in a non-volatile memory cache is beneficial and desirable.
While each of the drawing figures depicts a particular embodiment for purposes of depicting a clear example, other embodiments may omit, add to, reorder, and/or modify any of the elements shown in the drawing figures. For purposes of depicting clear examples, one or more figures may be described with reference to one or more other figures, but using the particular arrangement depicted in the one or more other figures is not required in other embodiments.