Databases are used to store information for an innumerable number of applications, including various commercial, industrial, technical, scientific and educational applications. As the reliance on information increases, both the volume of information stored in most databases, as well as the number of users wishing to access that information, likewise increases. Moreover, as the volume of information in a database, and the number of users wishing to access the database, increases, the amount of computing resources required to manage such a database increases as well.
Database management systems (DBMS's), which are the computer programs that are used to access the information stored in databases, therefore often require tremendous resources to handle the heavy workloads placed on such systems. As such, significant resources have been devoted to increasing the performance of database management systems with respect to processing searches, or queries, to databases.
Improvements to both computer hardware and software have improved the capacities of conventional database management systems. For example, in the hardware realm, increases in microprocessor performance, coupled with improved memory management systems, have improved the number of queries that a particular microprocessor can perform in a given unit of time. Furthermore, the use of multiple microprocessors and/or multiple networked computers has further increased the capacities of many database management systems.
From a software standpoint, the use of relational databases, which organize information into formally-defined tables consisting of rows and columns, and which are typically accessed using a standardized language such as Structured Query Language (SQL), has substantially improved processing efficiency, as well as substantially simplified the creation, organization, and extension of information within a database. Furthermore, significant development efforts have been directed toward query “optimization”, whereby the execution of particular searches, or queries, is optimized in an automated manner to minimize the amount of resources required to execute each query. In addition, a reduced reliance on runtime interpretation of queries in favor of increased usage of directly-executable program code has improved query engine performance.
Through the incorporation of various hardware and software improvements, many high performance database management systems are able to handle hundreds or even thousands of queries each second, even on databases containing millions or billions of records. However, further increases in information volume and workload are inevitable, so continued advancements in database management systems are still required.
In many object-oriented database designs, for example, various active objects used by clients of a database may be shared and/or reused to reduce the overhead associated with individually creating those active objects each time they are needed.
Connection pooling, for example, refers to the reuse and sharing of “connections” to a database. When a connection is requested, a connection object, providing various methods used to interface with a database, is created. Creation of a connection object often requires authentication with the database as well as an allocation of resources in the database to handle the connection, both of which may consume a substantial amount of time and resources in the database.
With connection pooling, created connection objects are maintained in a “pool”, so that, whenever a client wishes to open a connection, an existing connection object may be used in lieu of creating a new connection object, thus saving the overhead associated with creating the new connection object. In many environments it is desirable for clients to maintain an active connection with a database only when issuing requests on the database or performing updates to the database, so in such environments it is common for such clients to need to reconnect with a database on a relatively frequent basis. As a result, in such environments, connection pooling can provide substantial gains in database performance.
Related to the concept of connection pooling is statement pooling, which reuses prepared statements utilized in concert with requests made upon a database. Specifically, many database designs incorporate a prepare/execute data access model to process SQL queries. During a “prepare” operation, an access plan (often referred to as a prepared statement), is prepared for an SQL statement by the database. Then, during an “execute” operation, that access plan is executed with specific parameters to perform the desired query.
Generating a prepared statement for an SQL query often occupies much of the overhead associated with processing the query. Thus, by separating the preparation and execution of a query into separate operations, a prepared statement can be stored in a pool and reused by multiple queries, with later queries having basically the same SQL statement format as the original query merely providing new parameters to the prepared statement. Overall performance is therefore improved due to the elimination of many of the statement preparation operations that would otherwise be required to execute individual queries.
The benefits of statement pooling to both client throughput and database scalability can be substantial; however, one drawback of many statement pooling schemes results from the often non-optimal configuration of the various prepared statements in a statement pool. In many instances, prepared statements are generated from SQL statements initiated by automated tools that may not be capable of generating an optimal SQL statement format. Particularly given that the optimal configuration of an SQL query can vary from platform to platform, and from database to database, based upon different performance characteristic of the underlying platform and/or database, generic tools that generate SQL statements suitable for multiple platforms and/or databases likely are incapable of generating the optimal SQL statement for a particular situation. Moreover, even when SQL statements are initially created by application programmers, attempting to support multiple platforms and/or databases in a given application can often clutter up application code, and hamper the overall application development process. In still other instances, application programmers may simply not be aware of the optimal SQL statement format for a particular situation. As a result, the prepared statements resident in a statement pool often fall short of being optimally configured to efficiently implement a particular query on a particular platform and/or database.
Automated optimization of database queries has in general been performed at various stages in the life of a query, including during generation of prepared statements from SQL statements by a query optimizer, and during execution of prepared statements by a query engine. Irrespective of these various optimizations, however, it has been found that many prepared statements in a statement pool are often non-optimal in nature, simply based upon the fact that the optimizations may be performed on initially non-optimal SQL statements issued by an application program. As a result, the inefficiencies embodied in these non-optimal pooled statements are compounded as those statements are repeatedly reused.
An additional concern in many systems results from the resource-intensive nature of various types of optimizations. A goal of any optimization is for the additional overhead associated with performing an optimization to not exceed the overhead savings obtained by the use of an optimized representation of a query instead of an unoptimized representation of the same. Thus, in many instances, particularly where it is not known that particular queries will eventually be reused, optimization of such queries may not always be prudent.
Therefore, a significant need continues to exist for a manner of optimizing queries so as to maximize the relative benefits of optimization over the additional costs associated with the implementation of such optimizations.