Databases are used to store information for an innumerable number of applications, including various commercial, industrial, technical, scientific and educational applications. Many databases are relational databases, which organize information into formally-defined tables consisting of rows (i.e., records) and columns (i.e., fields), and which are typically accessed using a standardized language such as Structured Query Language (SQL). Database management systems (DBMS's) are the computer programs that are used to access the information stored in the databases and to process searches, or queries, against the databases.
In general, a database query references one or more database tables in a database and includes one or more predicates. Each predicate includes an expression that references a field of a database table and a key value of the database table, and an operator (e.g., equal, etc.). Operators (e.g., AND, etc.) may also be applied to multiple predicates. To execute the query, many DBMS's perform query optimization, in which multiple execution plans or access plans for satisfying the database query are examined to determine the most efficient way to execute the query.
One type of optimization that may be utilized in an access plan includes the use of a database index. A database index is usually built over one or more fields of the database table, and in many cases includes sufficient information about which particular records in a database table likely match a particular predicate without having to retrieve and scan all of the individual records of the database table. One type of index that may be utilized is an encoded vector index (“EVI”). An EVI is a data structure that is made up of two primary components: a symbol table and a vector table. The symbol table contains the distinct key values in the rows of a table covered, as well as statistical information about each key. The statistical information typically includes a numeric byte code identifying the key, the first and last rows of the table where the key is found (i.e., the relative record number (RRN)), and the number of times the key appears in the table (i.e., count). The vector table contains a list of byte codes indicating which key is contained in each row, and as the byte codes are generally in the same ordinal position as the rows in the table, the vector table corresponds to the actual rows in the table. Additionally, the byte codes are often utilized to dynamically create a bitmap index when the database query is run or executed against the table.
Another type of index that may be utilized is a radix index or radix tree index. Radix indexes, like bitmap indexes, typically utilize binary (i.e., 1's and 0's) to indicate the presence of a given key value, with a “1” indicating that the key value is present and a “0” indicating that it is not present. In some systems, radix indexes often have a field, usually a hidden key field as the last key field in the index, that contains the relative record numbers corresponding to the actual rows in the table.
Just as an index in a book facilitates locating information on a specific topic quickly and without blindly paging through the book, database indexes provide similar benefits by providing a method to quickly locate data of interest in a database. In particular, the shorter entries of an index may be quickly searched for data (e.g., a key value) that satisfies the database query, and then the corresponding locations in the database table may be searched to retrieve the results. Thus, only a subset of the database table is searched. Without an index, a DBMS performs a full table scan of the database table, blindly searching through every row in the database table until the target data is located. Depending upon where the data resides in the database table, such a table scan can be a lengthy and inefficient process.
To perform an indexed scan of a database table, techniques such as key selection and key positioning are often utilized on an index. For example, with a radix index, key selection (or a radix index scan) is an operation similar to a table scan in which all of the entries in the index are sequentially processed to identify the key values specified in the database query. Afterwards, the subset of entries from the index that contain the key value may be identified and the corresponding rows in the database table may be searched to retrieve the results for the query. As such, for a given key value, the decision to select or omit an entry from the index to search in the database table is based upon the key value sought.
With key positioning (or radix index probe), only a portion of the index is searched based upon the criteria in the database query. In particular, the structure of the radix index and its collating (i.e., sorted) sequence may be directly probed to find the starting and stopping values of a range of a key value specified in the database query. For example, as the radix index has a sorted collating sequence, if the query references an “x” key value, the range of x's in the index may be probed directly without having to scan the entire radix index as with key selection. The subset of entries from the radix index that contain the given key value may be searched in the database table to retrieve the results for the query.
Indexes have traditionally been utilized individually. For example, during typical query optimization, an index built over a field or an index built over multiple fields may be selected for an execution plan. After selection of the index, key selection or key positioning may be applied to the selected index and the database query may be executed to retrieve the results. No information is typically shared between two or more indexes, except for maybe estimate information or RRN information.
However, as the reliance on information increases, both the volume of information stored in most databases, as well as the number of users wishing to access that information, likewise increases. Moreover, as the volume of information in a database, and the number of users wishing to access the database increases, the amount of computing resources required to manage such a database increases as well. Thus, new ways to use database indexes, which often require many system resources to build, are needed in order to continue to provide significant improvements in query performance; otherwise, database users will be hampered in their ability to maximize intelligent information retrieval.