Relational and object-relational database management systems store information in tables of rows in a database. To retrieve data, queries that request data are submitted to a database server, which computes the queries and returns the data requested.
Queries submitted to the database server must conform to the syntactical rules of a particular database statement language. One popular database statement language, known as the Structured Query Language (SQL), provides users a variety of ways to specify information to be retrieved. There are many different versions of SQL, some versions are standard and some proprietary, and there are a variety of extensions.
A database statement submitted to a database server is evaluated by a query optimizer. Based on the evaluation, the query optimizer generates an execution plan that defines operations for executing the database statement. Typically, the query optimizer generates an execution plan optimized for efficient execution. Generating an execution plan optimized for execution is referred to herein as query optimization.
In query optimization, a DBMS may evaluate various ways of accessing the database to retrieve data needed to execute a database statement. Ways of accessing the database are referred to herein as access paths. An access path chosen by query optimization may significantly affect the speed of the retrieval and the amount of resources consumed executing the database statement. Many access paths involve a database index to increase the speed of the data retrieval process. Thus, query optimization involves evaluating the use of one or more database indexes for retrieving data needed to execute a database statement.
Query optimization involves evaluating various alternative ways of executing a database statement. Evaluating more alternatives is more costly in terms of computational time and resources. If all or too many alternatives are evaluated for a database statement, then the benefit of generating an efficient database statement would be offset or overcome by the cost of query optimization.
To limit the cost of query optimization, alternatives that are evaluated by query optimization are often limited to a subset of the possible space of alternatives, leaving other alternatives ignored and unevaluated. As a consequence, a suboptimal alternative for executing a database statement may be selected when an ignored alternative was more optimal. As the possible space of alternatives ways of executing a database statement increases, the likelihood of selecting an even lesser suboptimal alternative increases.
The number of alternative ways of executing a database statement increases as the number of database indexes and therefore number of access paths increases. Thus as the number of database indexes increases, the cost of determining an optimal alternative way of executing a database statement and/or the risk of selecting a lesser optimal alternative increase.
The number of database indexes that may be used for a database statement depends on several factors. These include the number of tables accessed by a database statement, and the number of database indexes that have been created for each of the tables accessed by the database statement.
Table and Index Proliferation in N-Tier Architectures
Many DBMS's are used in an N-tier architecture, in which a middleware tier generates and issues queries to a DBMS. As a result of and due to various limitations of such dynamic database statement generation by the middleware, the number of tables referenced in queries issued to DBMS's in N-tier architectures is becoming greater. In addition, the number of database indexes per table has also increased significantly due to the number and type of applications that use any given table in an N-tier architecture. For example, in the Oracle Fusion Applications, database statements dynamically generated by middleware reference 20 to 40 tables, many of which have a large number of indexes. As a result, the number of indexes that need to be considered and evaluated by query optimization has increased manifold.
Described herein are techniques for improving query optimization in the face of proliferation of database indexes.