In-memory databases are developed to take full advantage of modern hardware to increase performance. By keeping all relevant data in main memory (e.g., Random Access Memory (RAM)), data processing operations can be significantly accelerated.
Enterprises and institutions increasingly use in-memory databases to process and analyze big data. Because all relevant data is kept in memory, an in-memory database generally outperforms a traditional database in operation speed. However, keeping all data in memory raises a problem when the available memory (RAM) is smaller than the total data size of the database. In such a case, only part of the database can be loaded into memory at a time, and memory replacement techniques are generally used: a portion of the database that is currently loaded into memory is exchanged for another portion that is requested but not currently loaded.
Many existing memory replacement algorithms are based on paging, a technique that mainly deals with memory management at the operating system level. A program's address space is divided into fixed-size pages, and physical memory is divided into page frames of the same size. If a requested page is not in main memory, it is swapped into main memory for execution. However, this technique is poorly suited to a column-based in-memory database, which does not use pages. To improve data scanning speed, column-based in-memory databases often keep each column in a contiguous address space. Consequently, memory swapping/replacement operates on whole data columns, which are usually much larger than page frames. In such cases, traditional page-based memory replacement techniques often do not work.
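
As a minimal illustration of replacement at column granularity, the following Python sketch keeps whole columns resident under a fixed memory budget and evicts entire columns when a requested column does not fit. The class name `ColumnCache`, the `load_column` callback, and the choice of least-recently-used (LRU) eviction are assumptions made for illustration only; the text above does not prescribe a particular eviction policy.

```python
from collections import OrderedDict


class ColumnCache:
    """Hypothetical column-granular memory manager (illustrative only)."""

    def __init__(self, capacity_bytes, load_column):
        self.capacity_bytes = capacity_bytes
        # Assumed loader: maps a column name to its data as a bytes-like object.
        self.load_column = load_column
        # Resident columns, ordered from least to most recently used.
        self.resident = OrderedDict()
        self.used_bytes = 0

    def get(self, name):
        # Hit: the column is already in memory; mark it most recently used.
        if name in self.resident:
            self.resident.move_to_end(name)
            return self.resident[name]

        # Miss: fetch the whole column. A real system would likely read the
        # column size from metadata before loading; this sketch loads first.
        data = self.load_column(name)
        size = len(data)
        if size > self.capacity_bytes:
            raise MemoryError(f"column {name!r} exceeds the memory budget")

        # Evict whole columns, least recently used first, until it fits.
        while self.used_bytes + size > self.capacity_bytes:
            _, evicted = self.resident.popitem(last=False)
            self.used_bytes -= len(evicted)

        self.resident[name] = data
        self.used_bytes += size
        return data
```

Because each eviction releases an entire column rather than a fixed-size page, a single request may displace several resident columns, which is one way in which column-based replacement differs from page-based replacement.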