Data is often collected and stored in databases. Access to a database is managed by a database management system (DBMS), such as a relational database management system (RDBMS). To retrieve or update data in a database, database queries, such as Structured Query Language (SQL) queries, are submitted to the database management system.
A database typically includes multiple tables, where each table contains data arranged in rows and columns. In large databases, tables can be relatively large, with some tables having tens of millions to hundreds of millions of rows. In a database management system, tables are stored in persistent storage, which is usually implemented with large disk-based storage devices. Persistent storage usually has slower access speeds than non-persistent storage, such as volatile memory in the database management system. However, due to higher costs associated with higher-speed memory, the storage capacity of higher-speed memory in a database management system is usually much smaller than the storage capacity of persistent storage.
When performing a database operation, portions of tables are retrieved from persistent storage into the memory of the database management system to allow for faster processing. As the memory fills up during the database operation, data portions in the memory are replaced with new data portions retrieved from the persistent storage. Often, the data portions to be replaced are selected according to a least-recently-used (LRU) replacement strategy (or some variation thereof), which is either provided by the operating system or implemented inside the DBMS software of the database management system. The LRU replacement strategy removes the least-recently-used data from the memory to make room for new data.
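The LRU replacement strategy described above can be illustrated with a minimal sketch. The class name `LRUBuffer`, its `get` method, and the `read_from_disk` callback are hypothetical names chosen for illustration only; they do not correspond to any particular operating system or DBMS implementation, and the sketch assumes that data is cached in units identified by a simple page identifier.

```python
from collections import OrderedDict


class LRUBuffer:
    """Fixed-capacity in-memory buffer with least-recently-used eviction.

    Hypothetical illustration only; not the buffer manager of any real DBMS.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Keys are ordered from least recently used to most recently used.
        self.pages = OrderedDict()
        self.disk_reads = 0

    def get(self, page_id, read_from_disk):
        """Return the data for page_id, reading from storage on a miss."""
        if page_id in self.pages:
            # Hit: mark the page as most recently used.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Miss: fetch the page from persistent storage.
        data = read_from_disk(page_id)
        self.disk_reads += 1
        if len(self.pages) >= self.capacity:
            # Memory is full: evict the least-recently-used page.
            self.pages.popitem(last=False)
        self.pages[page_id] = data
        return data


buffer = LRUBuffer(capacity=2)
for page in ["A", "B", "A", "C"]:
    buffer.get(page, lambda pid: f"data-{pid}")

# "B" was the least recently used page when "C" arrived, so it was evicted.
print(list(buffer.pages))  # ['A', 'C']
print(buffer.disk_reads)   # 3
```

In this sketch, re-accessing "A" before "C" arrives protects "A" from eviction, so "B" is removed instead.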
A database operation, such as a join operation that joins multiple tables to produce a result, often involves repeated accesses to certain rows of one or more tables. Thus, as a database management system proceeds through a database operation, and rows that are in memory are replaced with other rows, the replaced rows may have to be later read back from persistent storage into the memory. Repeated retrievals of the same pieces of data from persistent storage to memory, especially if such repeated retrievals occur often, will result in increased I/O (input/output) accesses that can reduce performance of the database management system.
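The effect of repeated retrievals can be made concrete with a small simulation. The sketch below is an illustration under simplifying assumptions, not a model of any particular database management system: it assumes a simple nested-loop join in which every outer row triggers a full scan of the inner table, counts only reads of inner-table pages, and uses the hypothetical helper names `count_disk_reads` and `nested_loop_inner_accesses`.

```python
from collections import OrderedDict


def count_disk_reads(page_accesses, capacity):
    """Count reads from persistent storage under LRU replacement."""
    buffer = OrderedDict()  # ordered from least to most recently used
    reads = 0
    for page in page_accesses:
        if page in buffer:
            buffer.move_to_end(page)  # hit: no I/O needed
            continue
        reads += 1  # miss: page must be read from persistent storage
        if len(buffer) >= capacity:
            buffer.popitem(last=False)  # evict least-recently-used page
        buffer[page] = True
    return reads


def nested_loop_inner_accesses(outer_rows, inner_pages):
    """Inner-table pages touched when each outer row scans the inner table."""
    return [page for _ in range(outer_rows) for page in range(inner_pages)]


accesses = nested_loop_inner_accesses(outer_rows=3, inner_pages=4)

# Memory holds only 3 of the 4 inner pages: every access is a miss.
reads_small = count_disk_reads(accesses, capacity=3)
# Memory holds all 4 inner pages: only the first scan reads from storage.
reads_large = count_disk_reads(accesses, capacity=4)

print(reads_small)  # 12
print(reads_large)  # 4
```

With memory one page short of the inner table, LRU evicts each page just before it is needed again, so all 12 accesses go to persistent storage. When the whole inner table fits, only the first 4 accesses do.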