Computers are powerful tools for storing and accessing vast amounts of information.
Computer databases are a common mechanism for storing information on computer systems. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of sales may have a record for each sale where each record contains fields designating specifics about the sale, such as identifier, price, shipping address, order date, ship date, etc. An organized collection of related information in a database is sometimes referred to as a table having rows and columns. The rows of a table correspond to records and the columns of the table correspond to fields.
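As a non-limiting illustration, the correspondence between records and rows, and between fields and columns, may be sketched as follows. The class name Sale, the field names, the sample values, and the helper column() are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class Sale:
    """One record (row) of a hypothetical sales table."""
    sale_id: int
    price: float
    shipping_address: str
    order_date: date
    ship_date: date


# The table is a collection of records; each element is one row.
sales_table = [
    Sale(1, 19.99, "12 Elm St", date(2015, 3, 1), date(2015, 3, 3)),
    Sale(2, 5.50, "9 Oak Ave", date(2015, 3, 2), date(2015, 3, 4)),
]

# The fields of the record type are the columns of the table.
column_names = [f.name for f in fields(Sale)]


def column(table, name):
    """Return the values of one column (field) across all rows."""
    return [getattr(row, name) for row in table]
```

In this sketch, column(sales_table, "price") reads one field from every record, which is the column-wise view of the same data that sales_table holds row by row.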
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software intermediary or layer. Typically, all requests from users to access database data are processed by the DBMS. For example, information may be added to or removed from data files, retrieved from or updated in such files, and so forth.
A fundamental challenge in designing any DBMS is giving users the ability to quickly select a small subset of a large volume of database data. For example, a manager of a chain of retail stores may be interested in selecting information about sales that occurred on a particular date in a particular one of the stores from among historical sales data collected from all of the retail stores over the past five years. Typically, approaches for improving the performance of highly selective database queries include adding indexes on selected tables and/or partitioning selected tables.
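As a non-limiting illustration of how an index may help a highly selective query, the following sketch builds a simple in-memory lookup structure keyed on store and sale date. The row layout, the column names, and the functions build_index() and lookup() are hypothetical, and a dictionary is used here only as a stand-in for whatever index structure a particular DBMS actually provides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical historical sales data collected from several stores.
sales = [
    {"id": 1, "store": "A", "sale_date": date(2015, 3, 1), "price": 10.0},
    {"id": 2, "store": "B", "sale_date": date(2015, 3, 1), "price": 20.0},
    {"id": 3, "store": "A", "sale_date": date(2015, 3, 2), "price": 30.0},
    {"id": 4, "store": "A", "sale_date": date(2015, 3, 1), "price": 40.0},
]


def build_index(rows, key_columns):
    """Map each distinct key value to the positions of the rows that have it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        key = tuple(row[c] for c in key_columns)
        index[key].append(position)
    return index


def lookup(rows, index, key):
    """Fetch only the matching rows instead of scanning the whole table."""
    return [rows[p] for p in index.get(key, [])]


store_date_index = build_index(sales, ("store", "sale_date"))
matches = lookup(sales, store_date_index, ("A", date(2015, 3, 1)))
```

Here the lookup visits only the row positions recorded under the requested key, so its cost depends on the number of matching rows rather than on the size of the table.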
Partitioning is the ability of a DBMS to decompose a very large table and associated indexes into smaller and more manageable pieces called partitions. A column or group of columns may be used to determine the partition in which a particular row of data is stored. The column or the group of columns used for this purpose is sometimes called the partitioning key.
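As a non-limiting illustration, the role of a partitioning key may be sketched as follows. The choice of the year of the sale date as the partitioning key, the row layout, and the functions partition_key(), partition_table(), and select_from_partition() are assumptions made only for this example; an actual DBMS may partition by range, list, hash, or other schemes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows of a large sales table.
rows = [
    {"id": 1, "sale_date": date(2013, 6, 1), "price": 10.0},
    {"id": 2, "sale_date": date(2014, 1, 15), "price": 20.0},
    {"id": 3, "sale_date": date(2014, 9, 30), "price": 30.0},
    {"id": 4, "sale_date": date(2015, 2, 2), "price": 40.0},
]


def partition_key(row):
    """Assumed partitioning key: the year of the sale date column."""
    return row["sale_date"].year


def partition_table(table, key_fn):
    """Place each row in the partition selected by its partitioning key."""
    partitions = defaultdict(list)
    for row in table:
        partitions[key_fn(row)].append(row)
    return dict(partitions)


def select_from_partition(partitions, key, predicate):
    """Scan a single partition; rows in other partitions are never read."""
    return [r for r in partitions.get(key, []) if predicate(r)]


partitions = partition_table(rows, partition_key)
expensive_2014 = select_from_partition(partitions, 2014, lambda r: r["price"] > 25.0)
```

A query restricted to one value of the partitioning key touches only the corresponding partition, which is the source of the performance benefit for selective queries.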
More recently, approaches for improving the performance of highly selective database queries include clustering and using zone maps. Clustering refers to storing related data of a table in sorted order in contiguous on-disk data blocks. A zone map is then added to index the clustered data as stored on disk. Specifically, the zone map divides the clustered data into contiguous on-disk “regions” or “zones” of contiguous data blocks.
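As a non-limiting illustration, clustering followed by construction of a zone map may be sketched as follows. The text above does not state what a zone map records for each zone; this sketch assumes, as is common in practice, that each zone entry stores the minimum and maximum value of the clustered column so that zones which cannot contain a requested value are skipped. The functions cluster(), build_zone_map(), and scan_with_zone_map(), the fixed zone size, and the sample data are hypothetical, and an in-memory list stands in for on-disk data blocks.

```python
def cluster(rows, column):
    """Store rows in sorted order on the clustering column."""
    return sorted(rows, key=lambda r: r[column])


def build_zone_map(clustered_rows, column, zone_size):
    """Divide the clustered rows into contiguous zones and record,
    for each zone, its extent and (assumed) min/max column values."""
    zone_map = []
    for start in range(0, len(clustered_rows), zone_size):
        zone = clustered_rows[start:start + zone_size]
        values = [r[column] for r in zone]
        zone_map.append({
            "start": start,
            "end": start + len(zone),
            "min": min(values),
            "max": max(values),
        })
    return zone_map


def scan_with_zone_map(clustered_rows, zone_map, column, value):
    """Read only zones whose [min, max] range can contain the value."""
    result, zones_read = [], 0
    for z in zone_map:
        if z["min"] <= value <= z["max"]:
            zones_read += 1
            zone = clustered_rows[z["start"]:z["end"]]
            result.extend(r for r in zone if r[column] == value)
    return result, zones_read


# Hypothetical table with a single integer column to cluster on.
table = [{"sale_day": d} for d in [5, 1, 9, 3, 7, 1, 8, 2]]
clustered = cluster(table, "sale_day")
zone_map = build_zone_map(clustered, "sale_day", zone_size=2)
rows_found, zones_read = scan_with_zone_map(clustered, zone_map, "sale_day", 3)
```

Because the data is sorted before the zones are formed, the value ranges of the zones overlap little or not at all, so a selective lookup typically reads a small fraction of the zones.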
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.