Embodiments of the invention relate to using parallel insert sub-ranges to insert into a column store.
A database may store data in a table that consist of rows (also referred to as tuples or records) and columns (also referred to as fields or attributes) of data. The database may be accessed using Structured Query Language (SQL) interface. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
A table in a database may be accessed using an index. An index is an ordered set of references (e.g., pointers) to the records in the table. The index is used to access each record in the table using a key (i.e., one of the fields or attributes of the record, which corresponds to a column). The term “key” may also be referred to as “index key”. Without an index, finding a record requires a scan (e.g., linearly) of an entire table. Indexes provide an alternate technique to accessing data in a table. Users can create indexes on a table after the table is built. An index is based on one or more columns of the table. The data in the table may be stored in pages of physical storage.
A query may be described as a request for information from a database based on specific conditions. A query typically includes one or more predicates. A predicate may be described as an element of a search condition that expresses or implies a comparison operation (e.g., A=3).
The database may be column store oriented or row store oriented. Both have INSERT, UPDATE and DELETE (IUD) functionality. From the viewpoint of an external user or application, complete rows (possibly omitting columns that have defaults) are inserted, regardless of whether the database is row store oriented or column store oriented. Internally, in a row store oriented database the entire row is stored on a page, while, in a column store oriented database, each column value is stored in one of the pages that belongs to that column (while inserting a complete row in a single insert operation).
Tables that are column store oriented may be referred to as “columnar tables”. Some conventional systems insert into columnar tables using staging tables. A staging table may be row-based, supporting concurrent insert to the same pages through the usual locking/latching. Some conventional systems disallow updates to new rows. Some conventional systems do not compress data until merge time. Also, with staging tables, insert, update, and delete operations require updating two tables (a staging table and a target table), and queries require reading two tables (the staging table and the target table).