1. Technical Field
This invention generally relates to computer systems, and more specifically relates to apparatus and methods for optimizing a database query.
2. Background Art
Since the dawn of the computer age, computers have evolved and become more and more powerful. In our present day, computers have become indispensable in many fields of human endeavor including engineering design, machine and process control, information storage and retrieval, and office computing. One of the primary uses of computers is for information storage and retrieval.
Database systems have been developed that allow a computer to store a large amount of information in a way that allows a user to search for and retrieve specific information in the database. For example, an insurance company may have a database that includes all of its policy holders and their current account information, including payment history, premium amount, policy number, policy type, exclusions to coverage, etc. A database system allows the insurance company to retrieve the account information for a single policy holder among the thousands and perhaps millions of policy holders in its database.
Retrieval of information from a database is typically done using queries. A query usually specifies conditions that apply to one or more columns of the database, and may specify relatively complex logical operations on multiple columns. The database is searched for records that satisfy the query, and those records are returned as the query result.
Queries that are frequently run may be optimized to enhance system performance. In order to optimize a query, a query optimizer must be able to estimate the number of rows that the query will return. This estimate may then be used by the query optimizer to determine how to most efficiently execute the query. When an optimal index for the query exists, the optimal index may be probed using the keys that correspond to columns in the query to determine the number of rows that satisfy the query. When an optimal index for the query does not exist, but a sub-optimal index exists, the sub-optimal index may be probed. However, this sub-optimal index includes one or more additional keys that prevent traversing the sub-optimal index. The query optimizer may go ahead and use the number of rows returned as an estimate for rows that satisfy the query, and optimize the query accordingly. However, this estimate is flawed because of the one or more additional keys that prevent traversing the index. As a result, the index cannot be probed below the additional key. Thus, known query optimizers cannot make accurate optimizations to a query when only a sub-optimal index is available. Without a way to use a sub-optimal index to more accurately estimate the number of rows the query will return, the computer industry will continue to suffer from query optimizers that cannot accurately optimize queries when an optimal index is not available.