1. Technical Field
This invention relates to relational data base management systems, and more particularly, to the automatic selection of a data access strategy for the retrieval of data stored within relational data bases.
2. Description of the Prior Art
A data base management system (DBMS) is a computer system for recording and maintaining data. A relational data base management system stores data as rows in tables, and refers to a collection of tables as a data base. Users or application programs manipulate (select, update, insert, or delete) data by issuing requests or commands (called queries) against the data base. In a relational data base management system's data query and manipulation language, such as SQL, requests are nonprocedural (also referred to as nonnavigational). That is, users or application programs simply specify what is wanted, rather than specifying how to accomplish it. The system's optimizer has the responsibility for determining the optimal strategy for obtaining the data.
Relational data base management systems store rows of tables in data pages on physical storage devices such as disk drives. Data is transferred between the physical storage and the computer system's central processing unit (CPU) page by page, even though only a single row may be needed from a given data page. The time it takes to transfer data between physical storage and the processing unit is many times greater than the time it takes to process the data in the processing unit. Furthermore, the time it takes to randomly access separate physical data pages is as much as ten times longer than the time needed to sequentially access adjacent data pages. To manipulate data in a relational data base, the rows must first be transferred from physical storage to the processing unit, then processed in the processing unit, and finally transferred back to physical storage. Because transferring takes so much longer than processing, the total time required to manipulate the data can be dramatically reduced if the number of transfers can be reduced.
The data access strategy chosen by a system optimizer specifies the exact way (or access path) for obtaining and processing the data pages of a table. One way to access data is to sequentially scan every row in a table for those rows which match the search criteria. This is known as a table scan, because the entire table is scanned in sequence, data page by data page, from beginning to end.
Most relational data base systems also maintain indexes for their tables. An index is a list that a data base management system uses to access the rows of a table in a selected order. An index consists of many index entries. Each index entry contains a key value and an identifier of or pointer to the one or more rows of the table that contain the key value. Data base management systems typically store indexes on index pages that are physically separate from the data pages of the corresponding table.
One method of storing an index's pages is as a B-tree. A B-tree consists of a root page, intermediate pages that are dependent on the root page, and index leaf pages at the lowest level of the tree that are dependent on the intermediate pages. The term B-tree is a shortened name for "balanced tree", and refers to the fact that the tree is kept balanced: every leaf page lies at the same depth below the root, and each root or intermediate index page points to a roughly equal number of index pages. The B-tree's leaf pages contain the index entries. To scan a table's rows in the order specified by the index, the index's leaf pages are scanned sequentially and the index entries on each leaf page are used to access the rows in the index's order. This scan is called an "index sequential scan".
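The leaf-page chain traversal just described can be sketched in a few lines of Python. This is an illustrative sketch only; the page layout and the names `LeafPage` and `index_sequential_scan` are hypothetical, not taken from any system described here.

```python
# Illustrative sketch of an index sequential scan over B-tree leaf pages.
# The leaf pages are linked in key order; scanning them sequentially
# yields row identifiers in the order specified by the index.

class LeafPage:
    def __init__(self, entries, next_page=None):
        self.entries = entries      # list of (key, row_id) pairs, sorted by key
        self.next_page = next_page  # link to the next leaf page, or None

def index_sequential_scan(first_leaf):
    """Yield row identifiers in index key order by walking the leaf-page chain."""
    page = first_leaf
    while page is not None:
        for key, row_id in page.entries:
            yield row_id
        page = page.next_page

# Two leaf pages holding a small index on an integer key.
p2 = LeafPage([(30, "r5"), (40, "r2")])
p1 = LeafPage([(10, "r3"), (20, "r1")], next_page=p2)
print(list(index_sequential_scan(p1)))   # row ids in index key order
```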
There are two general types of indexes: clustering and nonclustering. An index is a "clustering index" if an index sequential scan causes only a single access of each data page. For this to occur, the table rows must be stored in the data pages of physical storage in approximately the same sequence as specified by the index. An index sequential scan through a clustering index (also referred to as a clustering index scan) is fast because the number of data page accesses is minimized. There are no duplicate accesses to the same data page, and both the index leaf pages and the data pages can be accessed sequentially rather than at random.
An index is a "nonclustering index" if an index sequential scan causes the system to access the data pages back and forth at random. Index sequential scans through nonclustering indexes (also referred to as nonclustering index scans) are extremely slow, because there is much thrashing back and forth between data pages. This type of index sequential scan requires separate data pages to be randomly accessed and transferred into and out of the processing unit's main memory, but usually only accesses one row from the many rows available on each such data page.
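The difference in data page accesses between clustering and nonclustering index scans can be made concrete with a small sketch. The function and its one-page-buffer assumption are hypothetical simplifications: a new page fetch is counted whenever the next row in index order lives on a different data page than the previous row.

```python
def data_page_accesses(row_pages):
    """Count data page fetches for an index sequential scan, assuming a
    one-page buffer: a new fetch is needed whenever the next row lives
    on a different data page than the previous one."""
    fetches = 0
    current = None
    for page in row_pages:
        if page != current:
            fetches += 1
            current = page
    return fetches

# Clustering index: rows visited in index order fall on consecutive pages.
clustered = [1, 1, 1, 2, 2, 3, 3, 3]
# Nonclustering index: index order jumps back and forth between pages.
nonclustered = [1, 3, 2, 1, 3, 2, 1, 3]
print(data_page_accesses(clustered))     # 3 fetches, one per data page
print(data_page_accesses(nonclustered))  # 8 fetches, one per row
```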
When the search criteria in a query specify the key columns of an index, that index can often provide a more efficient access path than a table scan for identifying and obtaining the data rows that satisfy or match the search criteria. The optimizer determines which of the available access paths is the most efficient based on statistical information about the data and on certain assumptions about the data. The types of statistical information about the data in a table that typically influence an optimizer's choice include the total number of rows in the table, the total number of data pages on which the rows of the table appear, the number of distinct index key values for each index defined on the table, the range of values for the first column of the index key of each index defined on the table, the number of levels in each index defined on the table, and the number of leaf pages in each index defined on the table.
The optimizer estimates the efficiency of the various available access paths using mathematical formulas that take the statistical information as input. These calculations make certain assumptions about how the statistical information is to be interpreted. Several formulas, for instance, make an assumption about the distribution of index key values when estimating the selectivity of the search criteria present within a data base query. The accuracy of the optimizer's estimates, and therefore the efficiency of the selected access path, depends critically on the availability of accurate statistical information about the data and on the degree to which the optimizer's assumptions about the data remain valid. Invalid assumptions about the uniform distribution of index key values can result in drastic performance degradation, including greatly increased CPU utilization and slower response times, in both single table queries and queries that join as few as two tables. The prior art "uniform distribution", "equi-width histogram" and "equi-height histogram" methods discussed below all make invalid assumptions about nonuniformly distributed ("skewed") index key values.
The "uniform distribution method" assumes that index key values are distributed between the lowest existing key value and the highest existing key value with equal frequency. Known current relational DBMS prototypes and products use this method, ignoring the possibility of non-uniform distributions of index key values.
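The uniform-distribution assumption leads to selectivity estimates of the following general kind. The function names and example numbers are hypothetical; this is a simplified sketch of the sort of calculation an optimizer might perform, not the formula of any particular system.

```python
# Sketch of selectivity estimation under the uniform-distribution assumption.

def uniform_selectivity_eq(distinct_keys):
    """Selectivity of 'col = value': every distinct key value is assumed
    to occur with equal frequency."""
    return 1.0 / distinct_keys

def uniform_selectivity_range(low, high, min_key, max_key):
    """Selectivity of 'low <= col <= high': key values are assumed to be
    spread evenly between the lowest and highest existing key values."""
    return (high - low) / (max_key - min_key)

# A 1,000-row table with 100 distinct keys spread over [0, 1000):
est_rows_eq = 1000 * uniform_selectivity_eq(100)                    # 10.0 rows
est_rows_range = 1000 * uniform_selectivity_range(0, 250, 0, 1000)  # 250.0 rows
```

If the data are skewed, e.g. one key value accounts for most of the rows, both estimates can be wrong by orders of magnitude, which is the failure mode the following histogram methods attempt to address.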
The "equi-width histogram" approach divides the range of occurring index key values into equal-sized subranges or buckets. Each bucket covers the same number of distinct key values from the range of occurring index key values, but the number of actual occurrences of index key values that correspond to each bucket may vary drastically, because multiple rows may share the same key value. Because the equi-width histogram method assumes a uniform distribution of index key values within individual buckets, it selects access paths to nonuniformly distributed data little, if any, better than the uniform distribution assumption used by current prototypes and products.
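A minimal sketch shows how an equi-width histogram assigns row counts to equal-width buckets, and how skewed data makes those counts drastically unequal. The bucket arithmetic here is a hypothetical simplification for integer keys, not the construction used by any particular system.

```python
def equi_width_histogram(keys, n_buckets):
    """Split the occurring key range into n_buckets equal-width subranges
    ("buckets") and count how many rows fall into each."""
    lo, hi = min(keys), max(keys)
    width = (hi - lo + 1) / n_buckets   # equal-width buckets over integer keys
    counts = [0] * n_buckets
    for k in keys:
        counts[min(int((k - lo) / width), n_buckets - 1)] += 1
    return counts

# Skewed data: key 0 appears 97 times; keys 25, 50, and 75 appear once each.
keys = [0] * 97 + [25, 50, 75]
print(equi_width_histogram(keys, 4))   # equal-width buckets, very unequal counts
```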
The "equi-height histogram" approach, proposed by Piatetsky-Shapiro et al. in "Accurate Estimation of the Number of Tuples Satisfying a Condition", Proc. of ACM-SIGMOD, Boston, MA, 1984, divides the set of actual occurrences of index key values into equal-sized buckets. Each bucket contains the same number of actual occurrences of index key values, but the number of distinct key values that correspond to each bucket varies from bucket to bucket. Like the equi-width histogram method, the equi-height method assumes that actual occurrences of index key values within individual buckets are uniformly distributed. The equi-height histogram approach handles highly skewed data better than the uniform distribution and equi-width histogram approaches, but performs poorly on queries that include "equal to" ("=") operators.
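A corresponding sketch, over the same skewed data as above, shows how equi-height bucket boundaries crowd around a heavily repeated key value. The function name and the boundary representation (the highest key value in each bucket) are hypothetical illustrations, not the representation of the cited proposal.

```python
def equi_height_histogram(keys, n_buckets):
    """Sort all key occurrences and record a bucket boundary after every
    len(keys) // n_buckets occurrences, so that each bucket covers the
    same number of rows; return the highest key value in each bucket."""
    ordered = sorted(keys)
    per_bucket = len(ordered) // n_buckets
    return [ordered[(i + 1) * per_bucket - 1] for i in range(n_buckets)]

# Skewed data: key 0 appears 97 times; keys 25, 50, and 75 appear once each.
keys = [0] * 97 + [25, 50, 75]
print(equi_height_histogram(keys, 4))   # boundaries crowd around the skewed value
```

Because three of the four buckets end at the same key value, the histogram records the skew toward key 0, which is why this method copes with skewed range predicates better than the equi-width approach.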
There is therefore a need for an improved method of selecting an access path in a relational data base management system having nonuniformly distributed index key values.
Lakshmi and Yu, "Access Path Selection in Relational Data Base Systems", IBM Technical Disclosure Bulletin, Feb., 1988, pp. 420-21, propose collecting distribution statistics on all index key values. This indiscriminate statistical collection entails greatly increased processing and computational overhead, resulting in poorer performance. There is therefore also a need for a method of collecting and storing frequency of occurrence statistics for index key values without incurring excessive processing overhead.