Many software systems, such as information retrieval systems, database engines, and database management systems (DBMSs), use a buffer pool or buffer cache to store recently accessed data. In these systems, buffer pool sizes are relatively large; capacities in the 100 GB range are common. A buffer pool in such software systems comprises a number of individual fixed-size pages. The DBMS manages data in the database and in the buffer pool in units of these fixed-size pages.
As the database is referenced during processing of user requests, pages of the database are read from one or more disks storing the database and are cached in the buffer pool when the data in a page is first accessed. The buffer pool may contain “clean” pages, which have not been modified in memory since being read from disk, and “dirty” pages, which include modifications to the database that have been made in the buffer pool. When the buffer pool is shut down, dirty pages (that is, the data contained in the dirty pages) must be written to disk or other persistent storage in order to preserve the data modifications contained in those pages.
Typically, existing software faces two problems associated with the shutdown and startup of buffer pools. The first problem arises because a buffer pool is managed at the granularity of a page: the pool consists of pages in memory that most likely do not come from contiguous disk locations. As a result, when a large percentage of pages are dirty, saving them is inefficient: writing dirty pages to disk may require writes to random or non-sequential offsets, which increases the I/O effort of the disk subsystem, and the pages in the buffer pool end up saved in a non-contiguous fashion on persistent disk storage.
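The first problem can be illustrated with a minimal sketch. It assumes a 4096-byte page size and a hypothetical `Page` record and `count_seeks` helper, none of which come from the text above, and it simply counts how many page writes do not begin where the previous write ended. The disk-ordered figure is included only as a point of comparison.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes; the text only says pages are fixed size


@dataclass
class Page:
    disk_page_no: int    # hypothetical field: position of the page in the database file
    dirty: bool = False  # True once the in-memory copy has been modified


def count_seeks(pages):
    """Count writes that do not start where the previous write ended."""
    seeks = 0
    expected_offset = None
    for page in pages:
        offset = page.disk_page_no * PAGE_SIZE
        if offset != expected_offset:
            seeks += 1  # the write head must move to a non-adjacent offset
        expected_offset = offset + PAGE_SIZE
    return seeks


# Buffer pool slots in the order pages happened to be cached (illustrative values).
pool = [Page(n, dirty=True) for n in (907, 12, 455, 13, 906, 3, 456, 14)]
dirty_pages = [p for p in pool if p.dirty]

# Writing in buffer pool order: every write lands at a non-adjacent offset.
seeks_in_pool_order = count_seeks(dirty_pages)

# Writing in disk order, for comparison: fewer seeks, but still not one sequential run.
seeks_in_disk_order = count_seeks(sorted(dirty_pages, key=lambda p: p.disk_page_no))

print(seeks_in_pool_order, seeks_in_disk_order)
```

With these illustrative page numbers, every write in buffer pool order starts at a non-adjacent offset. Even when the same pages are written in disk order, several separate runs remain, because the cached pages do not come from contiguous disk locations.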
The second problem results from a loss of information when the buffer pool is shut down. A populated buffer pool contains implicit information by virtue of the pages that are cached in it at any point in time. The set of pages cached in the buffer pool at a given point in time represents the set of pages that the DBMS considers most likely to be reused and thus most worthy of caching at that time. When a buffer pool is shut down, this implicit information may be lost, which is highly undesirable.
When restarted, a buffer pool management sub-system takes time to relearn which pages are the most worthy of caching in the buffer pool. This relearning effort may take a significant amount of time. As a result, the first accesses to the database are penalized because the referenced pages must be read from disk rather than from the buffer pool. Thus, the application that needs the data has to wait longer than it would if the buffer pool had already cached the desired page from persistent storage.
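The restart penalty can be sketched with a toy simulation. The text does not name a replacement policy, so the example assumes least-recently-used (LRU) eviction; the `LRUBufferPool` class, its capacity, and the workload are hypothetical. It compares the disk reads incurred by a populated pool with those incurred by an empty pool that has just been restarted.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Toy buffer pool with LRU eviction (an assumed policy, not from the text)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page number -> placeholder for page contents
        self.disk_reads = 0

    def get(self, page_no):
        if page_no in self.pages:
            self.pages.move_to_end(page_no)  # hit: mark as most recently used
            return
        self.disk_reads += 1                 # miss: page must be read from disk
        self.pages[page_no] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page


workload = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3]  # illustrative page reference string

# Populate a pool, then replay the workload against the warm pool.
warm = LRUBufferPool(capacity=4)
for page_no in workload:
    warm.get(page_no)
warm.disk_reads = 0
for page_no in workload:
    warm.get(page_no)
warm_reads = warm.disk_reads

# A restarted pool starts empty and must relearn the working set.
cold = LRUBufferPool(capacity=4)
for page_no in workload:
    cold.get(page_no)
cold_reads = cold.disk_reads

print(warm_reads, cold_reads)
```

In this toy case the warm pool serves the whole workload from memory, while the restarted pool pays one disk read for each distinct page before it reaches the same state.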
A further problem common to buffer pool starts and restarts is buffer pool allocation. Typically, a buffer pool is not made available for storing or retrieving pages until the portion of the buffer pool configured for storing pages is completely allocated in memory. This allocation may unnecessarily delay the start (or restart) of the information retrieval system using the buffer pool.
A solution to some or all of these shortcomings is therefore desired. What is needed is a system, a computer program product, and an associated method for maintaining cached information during shutdown and restart. The need for such a system has heretofore remained unsatisfied.