Nearly all commercial database systems rely on caching techniques to improve performance. Caches are often implemented in memory that can be accessed quickly, such as random access memory (RAM), as opposed to storage that takes longer to access, such as disk-based storage. Caches typically store frequently used data and reduce the time needed by a database system to access a data page.
Database systems that support multiple nodes sharing stored data need to keep track of different versions of the stored data. This is because one node may modify shared data while other nodes still require access to the original data. Some designs rely on locking to serialize reading and writing of the shared data, but this can be an inefficient use of resources. Thus, shared disk database systems have been developed that store different versions of data, where reads and writes from separate nodes can execute independently of each other, with each getting the correct data.
To handle this, in a system where data structures provide a hierarchical way of identifying pages, whether in cache or in storage, the data structures are associated with version identifiers. Pages in cache are accessed using the version identifier along with identifying information. In this system, pages in cache can include a version identifier to indicate the version of the data structure that initially read or created the page in cache. If no version of the page in cache matches the requested version, the correct version of the page is read from disk-based storage into the cache.
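By way of illustration only, the exact-match lookup described above can be sketched as follows. The class name `ExactVersionCache`, the `disk_reads` counter, and the representation of disk-based storage as a dictionary keyed by page identifier and version are assumptions made for this sketch and do not describe any particular system.

```python
class ExactVersionCache:
    """Cache keyed by (page_id, version); a miss falls back to storage."""

    def __init__(self, storage):
        # Hypothetical stand-in for disk-based storage:
        # a dict mapping (page_id, version) -> page contents.
        self._storage = storage
        self._cache = {}
        self.disk_reads = 0  # counts simulated physical reads

    def get(self, page_id, version):
        key = (page_id, version)
        if key not in self._cache:
            # No cached page matches the requested version,
            # so read the correct version from storage into the cache.
            self._cache[key] = self._storage[key]
            self.disk_reads += 1
        return self._cache[key]


storage = {(7, 1): "page 7 as of version 1", (7, 2): "page 7 as of version 2"}
cache = ExactVersionCache(storage)
first = cache.get(7, 1)   # miss: read from storage
again = cache.get(7, 1)   # hit: served from cache
newer = cache.get(7, 2)   # miss: a different version is requested
```

In this sketch every distinct version triggers its own read, even when the page contents did not change between versions, which is the cost that version ranges are intended to avoid.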
In a system where a single cache is used, the version of unchanged pages is shared among all versions of the data structure. Each cached page is associated with a version range, with the lower bound set to the initial version when the page was read or created and the upper bound set to infinity. During a page lookup, if the requested version falls within the version range of a page in cache, that page is returned. Setting the upper bound to infinity effectively allows the same page in cache to be shared by all higher versions. When a newer version of a page is created, the previous version of the page in cache is located and the upper bound of its version range is clipped from infinity to the new version minus one. This allows a subsequent page lookup to return the correct version of the page, based on whether the requested version falls within the version range of the previously cached page or the newly created page.
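The version-range lookup and the clipping step can be sketched as follows. This is a minimal illustration under stated assumptions: the names `CachedPage` and `VersionedCache`, the use of integers for versions, and the use of `math.inf` for the unbounded upper limit are hypothetical and are chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class CachedPage:
    page_id: int
    low: int      # version that first read or created the page
    high: float   # inclusive upper bound; math.inf means "all higher versions"
    data: str


class VersionedCache:
    def __init__(self):
        self._pages = {}  # page_id -> list of CachedPage

    def lookup(self, page_id, version):
        # Return the cached page whose version range contains the request.
        for page in self._pages.get(page_id, []):
            if page.low <= version <= page.high:
                return page
        return None

    def insert(self, page_id, version, data):
        # Clip the previous open-ended version to (new version - 1).
        for page in self._pages.get(page_id, []):
            if page.high == math.inf and page.low < version:
                page.high = version - 1
        new_page = CachedPage(page_id, version, math.inf, data)
        self._pages.setdefault(page_id, []).append(new_page)
        return new_page


cache = VersionedCache()
cache.insert(7, 1, "original")
shared = cache.lookup(7, 5)       # version 5 shares the page cached at version 1
cache.insert(7, 4, "modified")    # clips the original range to [1, 3]
old = cache.lookup(7, 3)
new = cache.lookup(7, 5)
```

After the second insert, a request for version 3 resolves to the original page and a request for version 5 resolves to the modified page, because the two version ranges no longer overlap.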
Cache retention algorithms allow all higher versions of a data structure to share the same cached page until a new version of the page is created. These work well in a single monolithic database server since all versions of a page are cached in one cache. When a new version of the page is being created in cache, the previous version of the page can be located and the upper bound of its version range can be clipped.
However, the same cache retention algorithms do not work in a shared disk database cluster where each node maintains its own cache. When a new version of the page is created and cached in one node, the previous version of the page may be cached in one or more other nodes. Without the ability to locate the previous version of the page in the local cache and clip the upper bound of its version range when a new version is created, all higher versions of the data structure can no longer share the same unchanged page in the local cache. This creates situations where the wrong page can be looked up from cache. For example, suppose a new version of a page is created in one cache while the previous version exists in a second cache. The old version in the second cache would keep its upper bound set to infinity. Therefore, if a request for the modified page reached the second cache after the update was made in the first cache, the second cache would return the wrong version because it was never informed of the change.
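The stale-read scenario can be reproduced with a small simulation. The sketch below assumes two nodes, each holding an independent cache with the version-range behavior described earlier; the class name `NodeCache` and the list-based entry layout are hypothetical and serve only to illustrate the problem.

```python
import math


class NodeCache:
    """Per-node cache: page_id -> list of [low, high, data] entries."""

    def __init__(self):
        self.pages = {}

    def lookup(self, page_id, version):
        for low, high, data in self.pages.get(page_id, []):
            if low <= version <= high:
                return data
        return None

    def insert(self, page_id, version, data):
        entries = self.pages.setdefault(page_id, [])
        # Clipping only reaches entries held in THIS node's cache.
        for entry in entries:
            if entry[1] == math.inf:
                entry[1] = version - 1
        entries.append([version, math.inf, data])


node_a, node_b = NodeCache(), NodeCache()
node_a.insert(7, 1, "original")
node_b.insert(7, 1, "original")

# A new version of page 7 is created on node A only.
node_a.insert(7, 4, "modified")

fresh = node_a.lookup(7, 5)  # node A clipped its old entry
stale = node_b.lookup(7, 5)  # node B still holds the range [1, infinity]
```

Node B was never told to clip its entry, so its lookup for version 5 still matches the original page, while the same request on node A resolves to the modified page.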
To address this issue, a shared disk database cluster must either always re-read a database page from physical storage when the data structure's version does not match the version of the page in cache, or use internode messaging to notify other nodes to clip the upper bound of the version range for the previously cached version when a new version of the page is created. Both approaches are expensive and can result in excessive network I/O or physical I/O. The issue can have a significant impact on performance, because even small and infrequent DML operations on a table can invalidate all of the cached pages of that table in all nodes.
In the drawings, like reference numbers generally indicate identical or similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.