The present invention relates generally to the field of columnar database management, and more particularly to handling queries during decomposed updates.
A columnar (column) database is a particular type of database in which a database management system (DBMS) arranges the information of a two-dimensional table so that all of the values of a given column are stored together. The database management system stores the values for each column in order, then stores the values of the next column in order, etc. Since improved compression rates are one of the main motivations for storing column values together, columnar databases typically store the values in a given column in a highly compressed manner (i.e., with little or no free space between column values). Rather than store record identifiers (RIDs) with each column value, column databases typically store column values for a given row in the same ordinal location for each column. For example, the fifth value in column 1 is part of the same row as the fifth value in column 2, etc.
Because of the highly compressed data and the need to maintain consistency in ordinal location for each row, typical column databases employ a decomposed update method to update row values. In a decomposed update, the DBMS deletes the current value for a row in each column, and then inserts a new, updated version (for column values which have changed) of each value in the same ordinal location of each column. During a decomposed update, individual rows can be locked while they are being modified. For example, a write lock can be put in place on a particular row in order to prevent users from reading a particular record while it is being updated by another user. Typical database management systems can perform lock tests to determine at a given point in time whether a particular record is locked for editing by another user.
Database administrators and other persons or machines (users) accessing the information in the database manipulate the data using particular commands, called queries, to the database. Common queries include “SELECT” to return a row, “INSERT” to insert a record, “UPDATE”, to change a record, and “DELETE”, to remove a record. When a user submits a query to return records with specific attributes, the DBMS returns records based on the nature of the query and the status of the record requested by the user. Another factor which affects the rows that can be viewed at a particular time is the isolation level of the request. In general, an isolation level determines what information in the database is visible to other users carrying out concurrent operations on the database. The lowest isolation level (i.e., provides the least consistent view of the data) is “uncommitted read” (UR), which allows users performing concurrent operations on the database to perform “dirty reads” (i.e., users can view changes that have not yet been committed to the database by other users). Another common isolation level in column databases is the “cursor stability” (CS) level. In a CS isolation level, a given row is visible if it is committed when it is examined by the query. This does not preclude the row from being invalidated at some later point in the life of the query. As a result, CS queries will provide a more consistent view of the data than UR queries, but the view is not fully consistent.