Many current database management systems (DBMSs) are designed under the assumption that the entire database is stored on disk and that only a small working set of the data fits in main memory. During query processing, data is fetched from disk into an in-memory working space. Because disk input/output (I/O) is a major bottleneck in this model, numerous optimizations have been proposed to reduce it.
The assumption that disk I/O is the major bottleneck holds today in systems where main memory is small relative to the database size. As random access memory becomes cheaper, however, computers are being built with increasingly large memories. As a result, many DBMSs today have enough main memory to hold low-end (e.g., relatively small) databases entirely.
During query processing in such systems, data is fetched in blocks from main memory into a cache, typically a processor-resident cache. The size of a block, referred to as a cache line, depends on the processor or system implementation and is usually 32, 64, or 128 bytes. When a data item needs to be accessed, the processor first looks in its local cache. If the item is not there, a cache miss occurs and the item is fetched from main memory.
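The relationship between byte addresses, cache lines, and cache misses can be illustrated with a minimal sketch. The names `cache_lines_touched` and `LineCache` are hypothetical and introduced only for illustration; the model assumes a single cache of unlimited capacity with no eviction, which a real processor cache does not have.

```python
def cache_lines_touched(address: int, size: int, line_size: int = 64) -> int:
    """Count the distinct cache lines covered by `size` bytes starting at `address`."""
    if size <= 0:
        return 0
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return last_line - first_line + 1


class LineCache:
    """Toy model of a cache (hypothetical; for illustration only).

    Assumption: unlimited capacity and no eviction, so only the first
    access to each cache line counts as a miss.
    """

    def __init__(self, line_size: int = 64):
        self.line_size = line_size
        self.resident = set()  # line numbers currently held in the cache
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, record it and load the line."""
        line = address // self.line_size
        if line in self.resident:
            return True
        self.misses += 1
        self.resident.add(line)
        return False
```

Under these assumptions, an 8-byte item that starts 4 bytes before a 64-byte boundary spans two cache lines, and so may cost two misses rather than one.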
Caches today are classified by “levels.” Typically, computer systems have multiple cache levels, such as L1, L2, and L3. Each level is numbered according to the order in which a processor accesses it (e.g., a processor will access an L1 cache prior to an L2 cache). Originally, only L1 caches resided on the semiconductor chip on which the processor was formed. Today, however, L2 caches also generally reside on the same semiconductor chip as the processor.
It has been shown that in commercial databases a significant component of the data access cost is the cost of fetching data into processor-resident L2 caches due to cache misses. The Pentium 4 processor, for example, can spend as many as 150 clockticks fetching an entry into the L2 cache, whereas an instruction is taken to cost one clocktick. Just as traditional databases are optimized to reduce disk accesses, in-memory databases can gain performance by reducing the number of cache misses, that is, the number of cache lines fetched from memory. By contrast, if a database resides mostly on disk, the benefit of reducing cache misses is generally overshadowed by the relatively high cost of disk accesses.
A DBMS component that significantly impacts the performance of queries is the data index. The purpose of a data index is to facilitate efficient access to data items in records of the database. In the case of in-memory databases, the cost of index traversal depends on the number of cache lines fetched from memory.
Recently, a number of projects have focused on implementing tree-based in-memory data indexes that perform well in main memory. In such a tree-based data index structure, the tree is traversed from a root node, through intermediate nodes, to a leaf node containing a pointer to a record. Each node typically comprises keys with associated pointers, either to other nodes (in root and intermediate nodes) or to records (in leaf nodes). The structure is searched by key lookups in successive nodes until a match between the search key and a key in a leaf node is found or is determined not to exist. On a match, the leaf node points to a record containing information corresponding to the search key.
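The traversal described above can be sketched as follows. This is an illustrative sketch only, not any particular proposed index: the `Node` class, the `lookup` function, and the separator convention (separator key `i` is the smallest key reachable through child `i + 1`) are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical index node: sorted keys plus child nodes or records."""
    keys: list
    children: list = field(default_factory=list)  # root/intermediate nodes
    records: list = field(default_factory=list)   # leaf nodes, parallel to keys

    @property
    def is_leaf(self) -> bool:
        return not self.children


def lookup(root: Node, search_key):
    """Descend from the root to a leaf; return the matching record or None."""
    node = root
    while not node.is_leaf:
        # Assumed convention: keys[i] is the smallest key under children[i + 1],
        # so bisect_right selects the child whose range contains search_key.
        node = node.children[bisect_right(node.keys, search_key)]
    i = bisect_left(node.keys, search_key)
    if i < len(node.keys) and node.keys[i] == search_key:
        return node.records[i]
    return None


# A small two-level example tree with made-up keys and records.
leaves = [
    Node(keys=[10, 20], records=["rec10", "rec20"]),
    Node(keys=[30, 40], records=["rec30", "rec40"]),
    Node(keys=[50, 60], records=["rec50", "rec60"]),
]
root = Node(keys=[30, 50], children=leaves)
```

With this example tree, `lookup(root, 40)` descends through the middle child and returns `"rec40"`, while `lookup(root, 35)` reaches the same leaf and returns `None` because no key matches.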
The cost of in-memory data retrieval depends on the height of the tree as well as on the cost of key lookups within the nodes. For disk-based accesses, by contrast, the latter cost component was considered negligible. Most proposals for in-memory data indexes concentrate on reducing the cost of key searches within a node. A variety of in-memory data indexes have been proposed, but these indexes and their use could still be improved.
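The two cost components can be combined in a rough back-of-the-envelope model. The sketch below is a simplification under stated assumptions: the functions `tree_height` and `estimated_miss_cost` are hypothetical, every node is assumed to be full, every cache line touched is assumed to miss, and the default penalty of 150 clockticks is the upper figure quoted above for the Pentium 4.

```python
def tree_height(num_keys: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= num_keys.

    Assumption: every node is full, so each level multiplies capacity by fanout.
    """
    if num_keys < 1 or fanout < 2:
        raise ValueError("num_keys must be >= 1 and fanout must be >= 2")
    height, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height


def estimated_miss_cost(num_keys: int, fanout: int,
                        lines_per_node_search: int,
                        miss_penalty: int = 150) -> int:
    """Estimated clockticks spent on cache misses for one lookup.

    Assumption: each visited node costs `lines_per_node_search` cache misses,
    a caller-supplied figure that reflects how the keys are laid out and searched.
    """
    return tree_height(num_keys, fanout) * lines_per_node_search * miss_penalty
```

For example, under these assumptions one million keys indexed with 16 keys per node give a height of 5; if searching a node touches a single cache line, a lookup costs roughly 5 × 150 = 750 clockticks in misses. Reducing either the height or the number of lines touched per node lowers the estimate.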