The present invention relates to data storage management in database systems. More particularly, the present invention relates to the structure and use of buffer caches in database management systems.
Databases add data to and retrieve data from mass storage devices during normal operation. Unfortunately, such storage devices are typically mechanical devices, such as disks or tape drives, which transfer data relatively slowly. Thus, databases which must frequently access information stored on disks can be slow. To speed up the access process, some databases employ a "buffer cache," which is a section of relatively faster memory (e.g., RAM) allocated to store recently used data objects. Throughout the remainder of the specification, this faster memory will simply be referred to as "memory," as distinguished from mass storage devices such as disks. Memory is typically provided on semiconductor or other electrical storage media and is coupled to the CPU via a fast data bus. Because the transfer of data in memory is governed by electronic rather than mechanical operations, data stored in memory can be accessed much more rapidly than data stored on disks. In fact, the ratio of memory access speed to disk access speed is usually at least 10:1. That is, information stored in memory can be accessed at least ten times faster than the same information stored on a disk.
Because the buffer cache has a limited size, some method must be employed for controlling its content. Conventionally, data storage systems employ a "least recently used--most recently used" (LRU/MRU) protocol to queue data objects in the buffer cache. Every time a database operation accesses a data object in an LRU/MRU system, that object is moved to the head of the queue (i.e., it is the "most recently used" data object). Simultaneously, the data objects that were ahead of it in the queue are each moved one step toward the end of the queue. Infrequently used objects thus migrate toward the end of the queue, and ultimately are deleted from the buffer cache to make room for new data objects copied from disks. Thus, if a request is made to access a data object not currently in the buffer cache (e.g., it resides only on a disk), that object is added to the cache and the data object at the end of the queue (i.e., the "least recently used" object) is deleted. In this manner, the most recently used data objects are the only objects stored in the buffer cache at any given time.
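The queueing behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not an implementation taken from any particular database system: the class name `LRUBufferCache`, the `read_from_disk` callable, and the `disk_reads` counter are all hypothetical names introduced for the example.

```python
from collections import OrderedDict


class LRUBufferCache:
    """Toy LRU/MRU buffer cache (illustrative names, not from the specification)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity              # maximum number of cached objects
        self.read_from_disk = read_from_disk  # hypothetical stand-in for a disk read
        self.queue = OrderedDict()            # first key = MRU head, last key = LRU end
        self.disk_reads = 0

    def access(self, key):
        if key in self.queue:
            # Cache hit: move the object to the head of the queue.
            self.queue.move_to_end(key, last=False)
            return self.queue[key]
        # Cache miss: copy the object from disk.
        value = self.read_from_disk(key)
        self.disk_reads += 1
        if len(self.queue) >= self.capacity:
            # Delete the least recently used object at the end of the queue.
            self.queue.popitem(last=True)
        self.queue[key] = value
        self.queue.move_to_end(key, last=False)
        return value


cache = LRUBufferCache(capacity=3, read_from_disk=lambda key: f"contents of {key}")
for name in ["A", "B", "C", "A", "D"]:
    cache.access(name)

print(list(cache.queue))  # ['D', 'A', 'C']  ("B" was least recently used and was deleted)
print(cache.disk_reads)   # 4
```

In this run, the second access to "A" is served from the cache, while the request for "D" forces out "B", the object that had gone longest without being used.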
Unfortunately, this process of memory management is indiscriminate: it ranks data objects solely by how recently they were accessed, and therefore frequently fails to make the most efficient use of the buffer space. For example, if a very frequently used data object (a "hot" object) is accessed at regular but relatively lengthy intervals, that object may actually be deleted from the buffer cache before it can be reaccessed. Thus, that object must be recopied from a disk each time that it is used.
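The following short simulation illustrates the problem under stated assumptions. The access trace, the object names, and the helper `count_disk_reads` are hypothetical and chosen only to make the effect visible: a "hot" object is requested once every five accesses, with four one-off objects in between, against a plain LRU queue.

```python
from collections import OrderedDict


def count_disk_reads(trace, capacity):
    """Replay an access trace through a plain LRU queue.

    Returns a dict mapping each object to the number of times it had to be
    copied from disk. Illustrative helper, not from the specification.
    """
    queue = OrderedDict()  # last key = most recently used
    reads = {}
    for key in trace:
        if key in queue:
            queue.move_to_end(key)          # hit: mark as most recently used
        else:
            reads[key] = reads.get(key, 0) + 1
            if len(queue) >= capacity:
                queue.popitem(last=False)   # evict the least recently used object
            queue[key] = True
    return reads


# Hypothetical trace: "hot" is used in every round, separated by four
# objects that are each used exactly once.
trace = []
for round_no in range(10):
    trace.append("hot")
    trace.extend(f"cold-{round_no}-{i}" for i in range(4))

reads_small = count_disk_reads(trace, capacity=4)
reads_large = count_disk_reads(trace, capacity=5)

print(reads_small["hot"])  # 10: recopied from disk on every use
print(reads_large["hot"])  # 1: survives only because the cache is one block larger
```

With four blocks, the four intervening objects always push "hot" off the end of the queue before it is requested again, so the most frequently used object in the trace is never served from the cache.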
Conventional buffer cache memory management systems also have problems in the way they store widely varying volumes of data. The buffer cache is typically divided into a plurality of storage blocks, each of equal storage capacity (e.g., 2 kilobytes). Unfortunately, if a large volume of data is copied to the buffer cache in one transaction, that data must be separately loaded into memory as small chunks sized to fit within the individual storage blocks. For example, assume a program must read a 2,000,000-kilobyte chunk of data. If the storage blocks are 2 kilobytes in size, the computer has to perform 1,000,000 I/O (input/output) operations to copy the entire volume of data. This can considerably slow the operation of the database. On the other hand, if a buffer cache is divided into larger storage blocks (e.g., 64 kilobytes), it will accommodate larger volumes of data, thus reducing the number of I/O operations. Unfortunately, such larger storage blocks are inefficiently utilized when small volumes of data are copied from disks. For instance, reading a 2-kilobyte page into a 64-kilobyte storage block wastes 62 kilobytes of that block.
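The trade-off can be checked with simple arithmetic. The sketch below assumes one I/O operation per storage block filled, as in the example above; the function names `io_operations` and `wasted_kb` are introduced here purely for illustration.

```python
def io_operations(data_kb, block_kb):
    """Number of block-sized I/O operations needed to copy data_kb kilobytes."""
    return -(-data_kb // block_kb)  # integer ceiling division


def wasted_kb(data_kb, block_kb):
    """Kilobytes of allocated block space left unused after the copy."""
    return io_operations(data_kb, block_kb) * block_kb - data_kb


large_read_small_blocks = io_operations(2_000_000, 2)
large_read_large_blocks = io_operations(2_000_000, 64)
small_read_waste = wasted_kb(2, 64)

print(large_read_small_blocks)  # 1000000
print(large_read_large_blocks)  # 31250
print(small_read_waste)         # 62
```

Under this assumption, moving from 2-kilobyte to 64-kilobyte blocks cuts the large read from 1,000,000 operations to 31,250, but each 2-kilobyte page then leaves 62 kilobytes of its block unused.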
In view of the above problems, it would be desirable to manage buffer cache memory more efficiently. In fact, some research has been conducted to identify better methods of memory management. However, these approaches have met with little success because they ultimately attained efficiency only by making the buffer memory space very large, thus reducing the amount of memory available for other uses such as the operating system and individual programs.