The present invention relates to operating a database system and to the database system itself. In one embodiment, the invention provides a prefix-based memory allocator for paged leaf nodes of a search tree.
In database systems, a checkpoint is an administrative operation in which a database image is persistently stored for use in a possible future recovery of a shut-down or failed server. In in-memory databases, a database image (a checkpoint image) is also instrumental in starting up the database server, because all the data has to be loaded into memory. Depending upon the implementation specifics, a checkpoint image may consist of complete indexes and storage pages (storing the user data), may consist of storage pages only, or may consist of something between those two extremes. If the indexes are not included in the checkpoint image, they are called transient indexes, and they need to be recreated when the database is restored from a checkpoint image.
The user data may be envisaged as a collection of database table rows, which are commonly referred to as tuples. The tuples are pointed to by index entries using direct pointers. During a database checkpoint, the tuples are copied to page-sized memory buffers (“checkpoint buffers”) for a disk write. When an in-memory database engine is started, all data is read from a checkpoint image stored on secondary storage, typically a hard disk. Client requests can be served as soon as the data (the database rows) becomes accessible in the main memory. In an in-memory database engine, both the user data and the necessary navigation information (indexes) need to be present. There are two options for achieving this. In the first option, where indexes are transient, the rows are re-inserted into the database and, in addition, the indexes are re-created. In the second option, where the checkpoint includes the necessary navigation information within the image, it is possible to read data from the checkpoint so that the server can be opened instantaneously, and user data can be restored based on the client's needs.
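The copying of tuples into page-sized checkpoint buffers can be illustrated with a minimal sketch. The names `PAGE_SIZE` and `pack_into_pages`, the tiny page size, and the zero-padding of partially filled pages are assumptions made for illustration only; they are not taken from any particular database engine.

```python
PAGE_SIZE = 64  # hypothetical page size in bytes, kept tiny for illustration


def pack_into_pages(rows, page_size=PAGE_SIZE):
    """Copy serialized rows into page-sized buffers.

    Returns (pages, locations), where locations[i] is the
    (page_number, offset) at which rows[i] was written.
    """
    pages = []
    locations = []
    current = bytearray()
    for row in rows:
        if len(row) > page_size:
            raise ValueError("row does not fit in one page")
        # Start a new page when the row would overflow the current one.
        if len(current) + len(row) > page_size:
            pages.append(bytes(current.ljust(page_size, b"\x00")))
            current = bytearray()
        locations.append((len(pages), len(current)))
        current += row
    # Flush the last, possibly partially filled, page.
    if current:
        pages.append(bytes(current.ljust(page_size, b"\x00")))
    return pages, locations


rows = [b"a" * 30, b"b" * 30, b"c" * 30]
pages, locations = pack_into_pages(rows)
```

In this sketch each full buffer would then be handed to a disk write; the recorded `(page_number, offset)` pairs are the kind of information a checkpoint would need in order to find a row again later.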
The first option is becoming impractical due to ever-increasing memory usage and correspondingly larger database sizes. In practical database implementations, reading an entire checkpoint image into memory can take several hours. The second option depends on the necessary navigation information remaining usable after a restore. However, the row pointers held in the indexes are useless as such, because the rows are likely to be positioned in different memory locations when they are restored from the checkpoint. Thus, address translation is needed so that row pointers can be redirected to the correct locations of the rows in the checkpoint image. Excluding transient indexes from a checkpoint greatly simplifies, and speeds up, checkpoint creation. The downside is that it then becomes impossible to rapidly find individual rows in a large checkpoint image.
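The address translation mentioned above can be sketched as follows. This is only one possible arrangement, assuming a simple lookup table from each row's pre-checkpoint memory address to its location in the checkpoint image; the function names, the example addresses, and the `(page_number, offset)` location format are all hypothetical.

```python
def build_translation(old_addresses, locations):
    """Map each row's pre-checkpoint memory address to its checkpoint location."""
    return dict(zip(old_addresses, locations))


def translate_index(index, translation):
    """Return a copy of the index whose row pointers refer to checkpoint locations."""
    return {key: translation[pointer] for key, pointer in index.items()}


# Hypothetical index: key -> direct memory pointer valid before the checkpoint.
index = {"alice": 0x1000, "bob": 0x2000}

# Hypothetical locations assigned to the same rows in the checkpoint image,
# expressed as (page_number, offset).
translation = build_translation([0x1000, 0x2000], [(0, 0), (0, 30)])

restored_index = translate_index(index, translation)
```

After translation, an index lookup yields a location inside the checkpoint image rather than a stale memory address, so a single row could be fetched without first loading the whole image.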