A common technique for storing a database in a data storage system is to assign database tables to a tablespace. A tablespace is a named collection of one or more datasets. Each tablespace is divided into equal-sized units called data pages, and each data page contains one or more tuples of data. A tuple typically consists of one record (e.g., one row) in a database table of the database.
Database systems are susceptible to data loss after a system failure. To prevent data loss, database systems usually copy the database from a volatile storage device to a non-volatile storage device. Modifications to data in pages of a database are recorded in a database log file (or database log) for performing recovery operations. A database log file contains a list of time-ordered actions that indicate what modifications were made to the database, and in what order those changes were made. The database log file is also stored in the non-volatile storage device for data durability.
Database recovery after a system failure generally involves reading the log file and applying log records from the log file to the appropriate page in the database in the order in which the log records were written into the log file. For each log record, the page is read from an input/output (“I/O”) buffer, the required modifications indicated in the log file are applied to the page, and the page is then written back into the I/O buffer for subsequent storage in the database. This process is repeated until all the log records are applied to the database and the database is restored to the state it was in before the system failure.
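The recovery loop described above can be illustrated with a minimal Python sketch. It is not taken from any particular database system: the `LogRecord` fields, the `replay` function, and the dictionary-based page and buffer representations are all hypothetical names chosen for illustration, and each log record is assumed to carry the full new value of one slot on one page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    # Hypothetical record layout: which page and slot changed, and the new value.
    page_id: int
    slot: int
    value: str


def replay(log, pages):
    """Apply log records to pages in the order they appear in the log.

    `log` is a list of LogRecord in write order; `pages` maps a page id to a
    dict of slot -> value and stands in for the stored database.
    """
    io_buffer = {}  # stand-in for the I/O buffer holding pages being modified
    for rec in log:
        # Read the page from the I/O buffer, loading it from storage on a miss.
        page = io_buffer.get(rec.page_id)
        if page is None:
            page = dict(pages.get(rec.page_id, {}))
        # Apply the modification indicated by the log record.
        page[rec.slot] = rec.value
        # Write the page back into the I/O buffer.
        io_buffer[rec.page_id] = page
    # Subsequent storage: copy every buffered page back into the database.
    for page_id, page in io_buffer.items():
        pages[page_id] = page
    return pages
```

Because the records are applied strictly in log order, a later write to the same slot overrides an earlier one, which is what brings the pages back to their pre-failure contents.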
Many database systems implement partitioning to improve performance. Partitioning a database log file into distinct parts can improve performance and availability of data. Log partitioning can also be used to enable multiple machines (or multiple process threads in a single machine) to write transaction log records to the database log in parallel. Such database systems implement a “last log buffer page” for appending new log records to the log. Conventionally, the last log buffer page is shared among all of the partitions and is configured to sort the log records as they are received from multiple sources. The last log buffer page can then record the log records in the sorted order.
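A shared last log buffer page of this kind can be sketched roughly as follows. This is an illustrative Python model only: the class `SharedLastLogPage`, the helper `run_writers`, the use of an integer timestamp as the sort key, and the use of threads to stand in for multiple sources are all assumptions made for the example, not details of any specific system.

```python
import bisect
import threading


class SharedLastLogPage:
    """Hypothetical model of one log buffer page shared by all writers."""

    def __init__(self):
        self._lock = threading.Lock()  # single lock guarding the shared page
        self._records = []             # kept sorted by (timestamp, payload)

    def append(self, timestamp, payload):
        # Every writer must take the same lock before inserting its record.
        with self._lock:
            bisect.insort(self._records, (timestamp, payload))

    def records(self):
        with self._lock:
            return list(self._records)


def run_writers(n_writers, per_writer):
    """Start n_writers threads (fewer than 10) that append to one shared page."""
    page = SharedLastLogPage()

    def writer(writer_id):
        for i in range(per_writer):
            # Assumed timestamp scheme: unique per record across writers.
            page.append(i * 10 + writer_id, f"w{writer_id}-{i}")

    threads = [threading.Thread(target=writer, args=(w,)) for w in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return page.records()
```

In this model the records come out in sorted order regardless of which writer produced them, but every append from every writer passes through the same lock on the same page.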
Because the last log buffer page of conventional systems is shared across multiple systems or process threads, such systems do not scale well for extreme online transaction processing (“OLTP”). For OLTP, conventional logging systems experience a bottleneck at the shared last log buffer page, because every machine or process must access that same page.