A database is a collection of stored data that is logically related and that is accessible by one or more users or applications. A popular type of database is the relational database, managed by a relational database management system (RDBMS), which includes relational tables, also referred to as relations, made up of rows and columns (also referred to as tuples and attributes, respectively). Each row represents an occurrence of an entity defined by a table, with an entity being a person, place, thing, or other object about which the table contains information.
One of the goals of a database management system is to optimize the performance of queries for access and manipulation of data stored in the database. Given a target environment, an optimal query plan is selected, with the optimal query plan being the one with the lowest cost (e.g., response time) as determined by an optimizer. The response time is the amount of time it takes to complete the execution of a query on a given system.
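The selection step described above can be illustrated with a minimal sketch. It assumes each candidate plan already carries a single estimated cost; the names `QueryPlan`, `estimated_cost`, and `choose_plan`, as well as the plan names and cost figures, are hypothetical and chosen only for illustration. A real optimizer would derive its costs from statistics about the data and the target environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPlan:
    """Hypothetical candidate plan with one scalar cost estimate."""
    name: str
    estimated_cost: float  # e.g., estimated response time in seconds


def choose_plan(plans):
    """Return the candidate plan with the lowest estimated cost."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    return min(plans, key=lambda plan: plan.estimated_cost)


# Illustrative candidates; the costs are made-up numbers, not measurements.
candidates = [
    QueryPlan("full_table_scan", 12.5),
    QueryPlan("index_lookup", 0.8),
    QueryPlan("column_partition_scan", 2.1),
]

best = choose_plan(candidates)
print(best.name)
```

In this sketch the plan with the smallest estimate is selected, mirroring the notion of the optimal plan as the lowest-cost plan determined by the optimizer.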
In some cases, tables in a relational database system may contain a very large amount of data. For example, large retail chains may operate relational databases that contain daily sales figures. The tables of daily sales figures may include millions or billions of rows and a large number of columns. There are a number of different ways to store such tables. As examples, tables may be row-stored, column-partitioned, row-partitioned, or multi-level partitioned.
There are a number of known advantages of column-partitioned tables over row-stored tables. One advantage is reduced input/output, because only the columns that need to be accessed by a query are loaded from disk. Another advantage is that compression can work very well for certain columns, particularly for pre-sorted columns. However, there are also a number of known disadvantages of column-partitioned tables. One disadvantage is that such tables are well-suited mainly for append operations, and less so for insert, delete, and update operations. Another disadvantage is that queries involving many columns can actually suffer performance degradation. With either storage format, a better access path is important because scanning all rows and/or columns in a large table is time-consuming and may impose an unacceptable load on computing resources.
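The input/output and compression advantages noted above can be illustrated with a small sketch. It is a simplified model under stated assumptions: the table contents, the column names (`sale_date`, `store_id`, `amount`, `items`), and all function names are hypothetical, and the count of values touched is used only as a rough stand-in for disk input/output, since real systems read blocks or pages rather than individual values. Run-length encoding is used here as one example of a compression scheme that benefits from sorted columns.

```python
from itertools import groupby

# Hypothetical daily sales table, stored row by row.
COLUMN_NAMES = ("sale_date", "store_id", "amount", "items")
rows = [
    ("2024-01-01", "store_1", 120.0, 3),
    ("2024-01-01", "store_2", 75.5, 2),
    ("2024-01-02", "store_1", 210.0, 5),
]


def to_columns(rows, column_names):
    """Pivot row-stored tuples into one list of values per column."""
    return {name: [row[i] for row in rows]
            for i, name in enumerate(column_names)}


def sum_amount_row_store(rows):
    """Sum 'amount' from row storage; whole rows are read to reach it."""
    amount_idx = COLUMN_NAMES.index("amount")
    values_read = sum(len(row) for row in rows)
    total = sum(row[amount_idx] for row in rows)
    return total, values_read


def sum_amount_column_store(columns):
    """Sum 'amount' from column storage; only that column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows, COLUMN_NAMES)
row_total, row_values_read = sum_amount_row_store(rows)
col_total, col_values_read = sum_amount_column_store(columns)
encoded_stores = run_length_encode(sorted(columns["store_id"]))

print(row_total, row_values_read)
print(col_total, col_values_read)
print(encoded_stores)
```

Both layouts produce the same total, but the column layout touches only one value per row for this query, while the row layout touches every value in every row. The sorted `store_id` column collapses into one pair per distinct store, which suggests why pre-sorted columns tend to compress well. The sketch does not model the disadvantages noted above, such as the cost of inserts, deletes, and updates, or of queries that need many columns.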