Databases are used to store information for a vast number of applications, including various commercial, industrial, technical, scientific and educational applications. As the reliance on information increases, both the volume of information stored in most databases and the number of users wishing to access that information likewise increase. Moreover, as the volume of information in a database and the number of users wishing to access the database increase, the amount of computing resources required to manage such a database increases as well.
Database management systems (DBMSs), which are the computer programs used to access the information stored in databases, therefore often require tremendous resources to handle the heavy workloads placed on such systems. Accordingly, significant resources have been devoted to increasing the performance of database management systems with respect to processing searches, or queries, to databases.
Improvements to both computer hardware and software have increased the capacities of conventional database management systems. For example, in the hardware realm, increases in microprocessor performance, coupled with improved memory management systems, have increased the number of queries that a particular microprocessor can perform in a given unit of time. Furthermore, the use of multiple microprocessors and/or multiple networked computers has further increased the capacities of many database management systems. From a software standpoint, the use of relational databases, which organize information into formally defined tables consisting of rows and columns, and which are typically accessed using a standardized language such as Structured Query Language (SQL), has substantially improved processing efficiency, as well as substantially simplified the creation, organization, and extension of information within a database.
Furthermore, significant development efforts have been directed toward query “optimization,” whereby the execution of particular searches, or queries, is optimized in an automated manner to minimize the amount of resources required to execute each query. A query optimizer typically generates, for each submitted query, an access plan. The access plan may include the use of Look Ahead Predicate Generation (LPG). LPG is a technology used in the iSeries DB2 from International Business Machines Corporation whereby local selection is generated on one table by obtaining the values of the columns to which it is joined in other tables. Typically, the optimizer determines whether or not LPG should be used before it starts to fetch rows for the query. Once the optimizer determines that LPG will be used, the predicates are generated, and the entire query is typically processed with the predicates. LPG is typically used on clearly complex queries, since the costs (e.g., time and resources) of processing a complex query without predicates are higher than the costs associated with generating the predicates and then processing the complex query using the predicates. The performance of long-running complex join queries, for example, is greatly enhanced through the use of LPG. Conversely, LPG is not used on clearly simple queries, since the costs of processing a simple query without predicates are lower than the costs associated with generating the predicates and processing the simple query using the predicates.
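By way of illustration only, the general idea of generating local selection on one table from the join-column values of another table, together with the cost comparison described above, may be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the DB2 implementation: the tables, the column names, and the functions `build_lookahead_predicate`, `join_orders_to_customers` and `should_use_lpg` are hypothetical, and the cost figures are arbitrary units.

```python
# Hypothetical in-memory tables; the names and data are illustrative only.
customers = [
    {"cust_id": 1, "region": "EU"},
    {"cust_id": 2, "region": "US"},
    {"cust_id": 3, "region": "EU"},
]
orders = [
    {"order_id": 10, "cust_id": 1, "total": 50},
    {"order_id": 11, "cust_id": 2, "total": 75},
    {"order_id": 12, "cust_id": 3, "total": 20},
    {"order_id": 13, "cust_id": 2, "total": 40},
]


def build_lookahead_predicate(other_rows, other_filter, join_column):
    """Build a local selection predicate for one table from another table.

    Collects the join-column values of the rows in the other table that
    satisfy that table's own selection, and returns a predicate that keeps
    only rows whose join column matches one of those values.
    """
    values = {row[join_column] for row in other_rows if other_filter(row)}
    return lambda row: row[join_column] in values


def join_orders_to_customers(order_rows, customer_rows, customer_filter,
                             order_predicate=None):
    """Join orders to the selected customers on cust_id.

    Returns the joined rows and the number of order rows that reached the
    join step, so the effect of the generated predicate can be observed.
    """
    selected = {c["cust_id"]: c for c in customer_rows if customer_filter(c)}
    joined, probed = [], 0
    for order in order_rows:
        # Apply the generated local selection, if any, before the join.
        if order_predicate is not None and not order_predicate(order):
            continue
        probed += 1
        customer = selected.get(order["cust_id"])
        if customer is not None:
            joined.append({**order, **customer})
    return joined, probed


def should_use_lpg(cost_without, cost_to_generate, cost_with):
    """Up-front decision rule: use LPG only if it lowers the total cost."""
    return cost_without > cost_to_generate + cost_with


def is_eu(customer):
    return customer["region"] == "EU"


lpg_predicate = build_lookahead_predicate(customers, is_eu, "cust_id")
plain_rows, plain_probed = join_orders_to_customers(orders, customers, is_eu)
lpg_rows, lpg_probed = join_orders_to_customers(
    orders, customers, is_eu, order_predicate=lpg_predicate
)
```

In this sketch both joins return the same rows, but with the generated predicate only two of the four order rows reach the join step instead of all four. Note that `should_use_lpg` is evaluated once from estimated costs, which mirrors the conventional approach in which the decision is made before any rows are fetched.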
Typically, however, for the majority of queries received by the optimizer, it is not clear whether or not a given query would benefit from the use of LPG. This uncertainty may be due to files being dynamically updated while the query is being processed, statistical imprecision during optimization, or contention for system resources. The uncertainty may lead to poor decision making by the optimizer and a decline in performance, as queries that would benefit from LPG, such as a query that becomes complex during processing due to a large dynamic upload, may be processed without LPG. Moreover, the uncertainty may lead the optimizer to process a query using LPG even though the query does not benefit from LPG, thus increasing the query's processing time instead of shortening it.
One cause of such poor decision making is that the optimizer typically determines whether or not to use LPG before initiating processing of the query. Therefore, any additional factors that arise after processing of the query begins are never considered when deciding whether or not LPG should be used.
A need therefore exists in the art for improving the performance of database queries, and in particular, for a more flexible and intelligent approach for utilizing LPG in connection with processing database queries.