Computers are powerful tools for storing and accessing vast amounts of information. Computer databases are a common mechanism for storing information on computer systems. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of sales may have a record for each sale where each record contains fields designating specifics about the sale, such as identifier, price, shipping address, order date, ship date, etc.
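As a minimal illustration of the record-and-field structure just described, the sketch below models the sales example in Python. The class name `SaleRecord`, the field types, and the sample values are assumptions made for illustration only; they are not taken from any particular database system.

```python
from dataclasses import dataclass, fields
from datetime import date

# Hypothetical record layout for the sales example; each attribute is one "field".
@dataclass
class SaleRecord:
    identifier: int
    price: float
    shipping_address: str
    order_date: date
    ship_date: date

# A tiny "database": an organized collection of related records.
sales_db = [
    SaleRecord(1, 19.99, "12 Elm St", date(2023, 1, 15), date(2023, 1, 17)),
    SaleRecord(2, 5.00, "9 Oak Ave", date(2023, 2, 20), date(2023, 2, 21)),
]

# The field names shared by every record in the collection.
field_names = [f.name for f in fields(SaleRecord)]
```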
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system (DBMS) is typically provided as a software intermediary or layer. Typically, all requests from users to access database data are processed by the DBMS. For example, information may be added to or removed from data files, retrieved from or updated in such files, and so forth.
A fundamental challenge in designing any DBMS is enabling users to quickly select a small subset of a large volume of database data. For example, a manager of a chain of retail stores may be interested in selecting information about sales that occurred on a particular date in a particular one of the stores from among historical sales data collected from all of the retail stores over the past five years. Typically, approaches for improving the performance of highly selective database queries include adding indexes on selected fields.
A database index allows the records to be organized in many different ways, depending on a particular user's needs. An index key value is a data quantity composed of one or more fields from a record, which are used to arrange (logically) the database file records in some desired order (index expression). Here, the column or columns on which an index is created form the key for that index. An index may be constructed as a single disk file storing index key values together with unique record numbers (e.g., RIDs). The record numbers are unique addresses of (pointers to) the actual storage location of each record in the database file.
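The index structure described above can be sketched as a sorted list of (key value, record number) pairs. The following is a simplified in-memory illustration, not a description of how any particular DBMS stores its indexes. The sample records, the function names `build_index` and `lookup`, and the use of list position as the record number are all assumptions.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sales records; the list position stands in for the record number (RID).
records = [
    {"id": 101, "store": "B", "order_date": "2023-03-02", "price": 19.99},
    {"id": 102, "store": "A", "order_date": "2023-01-15", "price": 5.00},
    {"id": 103, "store": "B", "order_date": "2023-01-15", "price": 42.50},
    {"id": 104, "store": "A", "order_date": "2023-02-20", "price": 7.25},
]

def build_index(records, key_fields):
    """Return (key value, RID) pairs sorted by key value.

    The key value is composed of one or more fields from each record.
    """
    entries = [(tuple(rec[f] for f in key_fields), rid)
               for rid, rec in enumerate(records)]
    entries.sort()
    return entries

def lookup(index, key):
    """Binary-search the sorted key values and return the matching RIDs."""
    keys = [k for k, _ in index]
    lo = bisect_left(keys, key)
    hi = bisect_right(keys, key)
    return [rid for _, rid in index[lo:hi]]

# Index on the composite key (order_date, store).
date_store_index = build_index(records, ("order_date", "store"))
matching_rids = lookup(date_store_index, ("2023-01-15", "B"))
matching_records = [records[rid] for rid in matching_rids]
```

The records themselves stay in their original order; only the index entries are sorted, which is what allows the same file to be arranged logically in several different orders by building several indexes.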
More recently, approaches for improving the performance of highly selective database queries include clustering and using zone maps. Clustering refers to storing related data of a table in a sorted order in contiguous on-disk data blocks. A zone map is then added to index the clustered data as stored on disk. Specifically, the zone map divides the clustered data into contiguous on-disk “regions” or “zones” of contiguous disk blocks.
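Clustering and zone maps can be illustrated with a small in-memory sketch. It assumes that each zone records the minimum and maximum value of the clustering column, which is a common zone map design but is not stated in the text above. The names `Zone`, `build_zone_map`, and `scan_with_zone_map`, the fixed `rows_per_zone` size, and the sample data are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Zone:
    start: int    # position of the first row in the zone
    end: int      # one past the position of the last row in the zone
    min_key: str  # smallest clustering-column value in the zone (assumed design)
    max_key: str  # largest clustering-column value in the zone (assumed design)

def build_zone_map(sorted_rows, key, rows_per_zone):
    """Divide clustered (sorted) rows into contiguous zones and summarize each."""
    zones = []
    for start in range(0, len(sorted_rows), rows_per_zone):
        chunk = sorted_rows[start:start + rows_per_zone]
        keys = [row[key] for row in chunk]
        zones.append(Zone(start, start + len(chunk), min(keys), max(keys)))
    return zones

def scan_with_zone_map(sorted_rows, zones, key, value):
    """Read only the zones whose [min, max] range could contain the value."""
    hits, zones_read = [], 0
    for z in zones:
        if z.min_key <= value <= z.max_key:
            zones_read += 1
            hits.extend(r for r in sorted_rows[z.start:z.end] if r[key] == value)
    return hits, zones_read

dates = ["2023-03-02", "2023-01-15", "2023-02-20", "2023-01-15",
         "2023-04-11", "2023-02-01", "2023-03-30", "2023-01-03"]
sales = [{"id": i, "order_date": d} for i, d in enumerate(dates)]

# Clustering: store the rows sorted by the clustering column.
clustered = sorted(sales, key=lambda r: r["order_date"])
zone_map = build_zone_map(clustered, "order_date", rows_per_zone=2)
hits, zones_read = scan_with_zone_map(clustered, zone_map, "order_date", "2023-01-15")
```

In this sketch, the query for a single order date reads two of the four zones and skips the rest. Because the rows are clustered, matching values fall in adjacent zones, which is what makes the per-zone summaries selective.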
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.