Databases store data on secondary storage devices, for example, disks. Accessing data stored in memory typically takes less time than accessing data from secondary storage. Secondary storage devices may require mechanical movement of parts to access data, for example, a seek operation. In contrast, accessing data stored in memory involves no mechanical movement of parts, which makes access to in-memory data fast. However, the cost per unit of storage of memory is much higher than the typical cost per unit of storage on secondary devices. Therefore, database systems typically have much less memory available for storing data than secondary storage.
Data is typically fetched from secondary storage into memory in units called blocks or database blocks for processing. Inefficient strategies for managing database blocks can result in inefficiencies in the database system. For example, if a process needs data for processing a query and the data is not available in memory, the process causes a block fault, i.e., the database system has to read the data from the disk. Meanwhile, the process has to wait for the requested data to become available, thereby slowing down the processing of the query.
Another problem occurs when there is no room in the memory to store the new database block being read. In this situation, the database system evicts a database block from the memory to make room for the new database block being read. If the database system evicts a database block that has been read into the memory but has not been processed yet, the entire effort of reading the evicted database block into memory is wasted.
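The block fault and wasted-read behavior described above can be illustrated with a minimal sketch. The `BufferPool` class, its method names, and the first-in-first-out eviction order are assumptions made for illustration only; they do not describe how any particular database system manages its blocks.

```python
from collections import OrderedDict


class BufferPool:
    """Toy in-memory cache of database blocks (hypothetical, FIFO eviction assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block_id -> True once the block is processed
        self.block_faults = 0        # number of reads from secondary storage
        self.wasted_reads = 0        # blocks evicted before being processed

    def load(self, block_id):
        """Bring a block into memory, evicting the oldest block if the pool is full."""
        if block_id in self.blocks:
            return  # already in memory, no block fault
        self.block_faults += 1
        if len(self.blocks) >= self.capacity:
            _, processed = self.blocks.popitem(last=False)  # evict the oldest block
            if not processed:
                self.wasted_reads += 1  # the earlier read of this block was wasted
        self.blocks[block_id] = False

    def process(self, block_id):
        """Process a block, faulting it in first if it is not in memory."""
        self.load(block_id)
        self.blocks[block_id] = True


pool = BufferPool(capacity=2)
pool.load("A")     # read A into memory
pool.load("B")     # read B into memory
pool.load("C")     # pool is full: A is evicted before it was processed
pool.process("A")  # A must be read again, and B is evicted unprocessed
print(pool.block_faults, pool.wasted_reads)  # prints "4 2"
```

In this run, four reads from disk are issued but only one block is ever processed, which shows how a poor eviction choice multiplies the cost of a single query.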
Another problem occurs if data used for processing a query is fragmented into chunks of data stored in different portions of a disk. If data is fragmented and stored in different portions of a disk, loading the data requires multiple seek operations, which slows down access to the data. One conventional mechanism to improve the performance of data accessed from disk is to perform a defragmentation operation that copies the blocks from their current locations to new locations and arranges the blocks of data in contiguous locations on the disk. However, this operation is typically executed manually by an operator, which increases the maintenance overhead of the database system. Also, the defragmentation operation can take a long time to execute since a very large amount of data is being copied. During this time, the database system may have to suppress processing of user queries in order to avoid synchronization issues.
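The effect of fragmentation on seek operations can be sketched as follows. The functions `count_seeks` and `defragment`, the block locations, and the simplified cost model (one seek whenever the next block is not at the immediately following location) are assumptions chosen for illustration, not measurements of a real disk.

```python
def count_seeks(locations):
    """Count seeks needed to read blocks in the given order of disk locations.

    Assumes one seek to reach the first block and another seek whenever the
    next block is not stored at the immediately following location.
    """
    seeks = 0
    previous = None
    for loc in locations:
        if previous is None or loc != previous + 1:
            seeks += 1
        previous = loc
    return seeks


def defragment(locations, start=0):
    """Return new contiguous locations for the blocks, preserving their order."""
    return [start + i for i in range(len(locations))]


# Hypothetical disk locations of the blocks needed by one query.
fragmented = [10, 11, 57, 58, 59, 203, 12]
contiguous = defragment(fragmented, start=300)

print(count_seeks(fragmented))  # prints 4
print(count_seeks(contiguous))  # prints 1
```

The sketch models only the benefit of contiguous placement. It does not model the cost of copying every block to its new location, which is the expensive part of defragmentation noted above.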