In a conventional row-based database, each row (i.e., record) of a database table is stored contiguously in memory. Accordingly, if a new record is added to a table, the values of the new record may be appended to the values of the existing records of the table.
In contrast, a columnar database stores values per table column. FIG. 1 shows table 10, including three columns, and memory locations 20 in which the values of table 10 are stored. Memory locations 20 may represent volatile and/or persisted memory.
The values of column Name are stored in locations beginning with memory location A, the values of column Address are stored in locations beginning with memory location B, and the values of column Telephone are stored in locations beginning with memory location C. More specifically, the values of the first record of table 10 are stored at memory locations A, B and C. Similarly, the values of the fourth record of table 10 are stored at memory locations A+3, B+3 and C+3.
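The column-wise layout described above may be sketched as follows. This is an illustrative sketch only: the flat `memory` list, the `COLUMN_CAPACITY` constant, the numeric values chosen for A, B and C, and the function names are assumptions made for the example and do not appear in FIG. 1. Records are indexed from zero, so the record at index i has its values at locations A+i, B+i and C+i.

```python
# Hypothetical number of locations reserved per column (not specified in the text).
COLUMN_CAPACITY = 8

# Assumed base locations of the three columns (A, B and C in the text).
A, B, C = 0, COLUMN_CAPACITY, 2 * COLUMN_CAPACITY
BASES = {"Name": A, "Address": B, "Telephone": C}


def store_columnar(records):
    """Store the records column by column in one flat list of memory locations."""
    if len(records) > COLUMN_CAPACITY:
        raise ValueError("more records than locations reserved per column")
    memory = [None] * (len(BASES) * COLUMN_CAPACITY)
    for i, record in enumerate(records):
        for column, base in BASES.items():
            # Value of `column` for the record at index i lives at base + i.
            memory[base + i] = record[column]
    return memory


def read_record(memory, i):
    """Reassemble the record at zero-based index i from the three column regions."""
    return {column: memory[base + i] for column, base in BASES.items()}


records = [
    {"Name": "Ann", "Address": "1 Elm St", "Telephone": "555-0101"},
    {"Name": "Bob", "Address": "2 Oak St", "Telephone": "555-0102"},
    {"Name": "Cy", "Address": "3 Ash St", "Telephone": "555-0103"},
    {"Name": "Dee", "Address": "4 Fir St", "Telephone": "555-0104"},
]
memory = store_columnar(records)
```

In this sketch, the location immediately following the region reserved for column Name (A + COLUMN_CAPACITY) is location B, which holds a value of column Address.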
If a new record is added to a columnar table, its values are not immediately appended to the memory locations of their respective columns, because the locations following the existing values of a column may be occupied by values of other columns. Instead, the values of the new record are appended to a delta structure, which stores changes to the table. Once the delta structure reaches a particular size, the data in the delta structure is merged with the actual columnar data of the table (e.g., by adding values of new records, deleting values of deleted records, and updating values of updated records). This merge results in overwriting entire columns, and the delta structure is thereafter empty.
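The delta-and-merge behavior described above may be sketched as follows. The class name `ColumnarTable`, the `merge_threshold` parameter, and the representation of the delta structure as a list of (operation, row index, values) entries are assumptions made for illustration; the text does not specify how the delta structure is organized or how its size limit is chosen. Row indices in this sketch refer to positions in the columnar data as of the last merge.

```python
class ColumnarTable:
    """Minimal sketch of a columnar table with a delta structure."""

    def __init__(self, column_names, merge_threshold=3):
        self.columns = {name: [] for name in column_names}  # actual columnar data
        self.delta = []  # pending changes: (operation, row_index, values)
        self.merge_threshold = merge_threshold  # hypothetical size limit

    def insert(self, values):
        self._record(("insert", None, dict(values)))

    def update(self, row_index, values):
        self._record(("update", row_index, dict(values)))

    def delete(self, row_index):
        self._record(("delete", row_index, None))

    def _record(self, change):
        # Changes go to the delta structure, not to the columns themselves.
        self.delta.append(change)
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def merge(self):
        names = list(self.columns)
        count = len(self.columns[names[0]]) if names else 0
        rows = [{n: self.columns[n][i] for n in names} for i in range(count)]
        for operation, row_index, values in self.delta:
            if operation == "insert":
                rows.append(values)
            elif operation == "update" and rows[row_index] is not None:
                rows[row_index].update(values)
            elif operation == "delete":
                rows[row_index] = None  # mark as deleted; dropped below
        live = [row for row in rows if row is not None]
        # Every column is rebuilt in full, even if only one value changed.
        self.columns = {n: [row[n] for row in live] for n in names}
        self.delta = []  # the delta structure is empty after the merge


table = ColumnarTable(["Name", "Address", "Telephone"], merge_threshold=3)
table.insert({"Name": "Ann", "Address": "1 Elm St", "Telephone": "555-0101"})
table.insert({"Name": "Bob", "Address": "2 Oak St", "Telephone": "555-0102"})
pending_before_merge = len(table.delta)
names_before_merge = list(table.columns["Name"])
table.insert({"Name": "Cy", "Address": "3 Ash St", "Telephone": "555-0103"})
```

In this usage, the first two records remain in the delta structure and the columns stay empty; the third insertion reaches the assumed threshold and triggers a merge that rewrites all three columns.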
The foregoing process occurs in volatile memory (e.g., Random Access Memory) and in persisted memory (e.g., disk). That is, each of volatile memory and persisted memory includes the actual columnar data and a delta structure, which is updated on each transaction. During a merge, the actual columnar data of the volatile memory is merged with the delta structure of the volatile memory, and the actual columnar data of the persisted memory is merged with the delta structure of the persisted memory.
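The duplication of this process across volatile and persisted memory may be sketched as follows. The `Store` class and the functions `apply_transaction` and `merge_all` are hypothetical names introduced for illustration, both stores are modeled as in-process objects rather than actual RAM and disk, and only insertions are modeled for brevity.

```python
from dataclasses import dataclass, field


def _empty_columns():
    return {"Name": [], "Address": [], "Telephone": []}


@dataclass
class Store:
    """One memory tier: actual columnar data plus its own delta structure."""
    columns: dict = field(default_factory=_empty_columns)
    delta: list = field(default_factory=list)

    def merge(self):
        # Build entirely new columns from the old columns and the delta.
        self.columns = {
            name: values + [record[name] for record in self.delta]
            for name, values in self.columns.items()
        }
        self.delta = []


volatile = Store()   # stands in for Random Access Memory
persisted = Store()  # stands in for disk


def apply_transaction(record):
    """Each transaction updates the delta structure of both tiers."""
    volatile.delta.append(dict(record))
    persisted.delta.append(dict(record))


def merge_all():
    """Each tier merges its own columnar data with its own delta structure."""
    volatile.merge()
    persisted.merge()
```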
Therefore, each merge requires many I/O operations in order to create a new columnar table in persisted memory, because all table data, not only the changed data, must be processed. These operations negatively affect performance. Moreover, in order to guarantee recoverability of the columnar data, the entire new columnar table must be written to the persisted database log, thereby creating a log so large that backing it up is not practical.
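The disproportion between changed data and written data may be illustrated with simple arithmetic. The function `merge_cost` below is a hypothetical helper that counts values rather than actual I/O operations, and it assumes a delta structure containing only insertions.

```python
def merge_cost(existing_rows, num_columns, delta_inserts):
    """Return (values_changed, values_written, values_logged) for one merge.

    Assumes the delta holds only insertions and that the merge rewrites
    every column in full, as described in the text.
    """
    total_rows = existing_rows + delta_inserts
    values_changed = delta_inserts * num_columns
    values_written = total_rows * num_columns  # entire new columnar table
    values_logged = values_written             # entire table also goes to the log
    return values_changed, values_written, values_logged


changed, written, logged = merge_cost(
    existing_rows=1_000_000, num_columns=3, delta_inserts=100
)
```

Under these assumptions, merging 100 new records into a three-column table of one million records changes 300 values but writes 3,000,300 values to persisted memory and the same number to the log.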