1. Technical Field
Present invention embodiments relate to index scans within databases, and more specifically, to efficiently accommodating gaps in the index scans.
2. Discussion of the Related Art
Database systems utilize indexes to enhance retrieval of data. Basically, an index produces a data set including and ordered by designated columns of a database table in order to enable rapid retrieval of data and avoid scanning substantial portions or the entirety of a database table. The index data set includes rows of the designated columns each with pointers to the corresponding rows in the database table containing the desired data. For example, a table, T1, with columns c1, c2, c3, c4 and c5 may include a composite index (e.g., an index including two or more columns of a database table), I1 for columns c1, c2, c3, and c4 of table T1. The index provides a data set with rows sorted by the values of columns c1, c2, c3, and c4 in this specific order. In other words, when a column includes two or more of the same values, the corresponding rows are sorted in the index data set by the value in a subsequent column of the index (e.g., when two or more of the same values reside in column c1, the corresponding rows are sorted by the value of column c2).
However, certain queries may present processing inefficiencies for a composite index. For example, a query for table T1 may include the following.
SELECT *
FROM T1
WHERE c2=20
This query retrieves rows from table T1 where the values of column c2 are 20. In this case, since column c1 is not constrained, start and stop keys cannot be used, thereby requiring all of the data in the index data set to be scanned to identify values of 20 within column c2. Column c1 provides a gap in the index for this example.
As a further example, a query for table T1 may include the following.
SELECT *
FROM T1
WHERE c1 BETWEEN 1 AND 10
AND c2=100
This query retrieves rows from table T1 where the values of column c1 are between 1 and 10, and the values of column c2 are 100. A query optimizer of the database system can generate an index scan (e.g., scan of the data set corresponding to the index) with a start key (for column c1) of 1 and a stop key (for column c1) of 10 to apply the first predicate (c1 BETWEEN 1 AND 10) on column c1. In order to satisfy the query, the second predicate (c2=100) on column c2 needs to be applied to the matching keys returned by the index for column c1. The second predicate (c2=100) can be applied as an index predicate, but cannot be added to the start/stop keys for column c1 since b-tree index technology employed by a typical relational database system does not allow a range to be specified for a leading column (e.g., column c1). Column c1 is known as a gap in the index.