Database management systems (DBMS's) store extensive amounts of organized information for access by users of the systems. In a database system environment, the system responds to specific queries from the user as to which data from the database to examine, retrieve and/or present. Typically, users provide queries in a particular syntax of a particular database language that is used by the system, such as Structured Query Language (SQL) format.
DBMS's utilize computer main memory for a variety of tasks. Important tasks include sorting and caching, including caching data pages (i.e., the tables of the database), as well as caching queries. Queries are stored in a query cache to speed up query processing. A query cache (e.g., “package cache” in the DB2 product from IBM Corporation) typically stores the SQL text for each query issued to the DBMS, as well as an executable form of the query which is the result of compiling the query. When a query is issued, the query cache is searched to determine if the DBMS has the executable form of the query already stored in the cache. If the executable form is already present, this indicates that the same query was previously received by the DBMS and compiled into its executable form at that time. Thus, the expensive and time-intensive process of compiling the query is avoided at later times if the query is already available in the query cache. The purpose of the query cache is to avoid incurring this cost as much as possible.
The storage capacity (size) of the query cache determines how many queries can be stored and thus determines how likely a query will be found in the cache. When the size of the query cache is exceeded, older queries are typically removed from the cache to make room for newer queries. In general, the smaller the query cache, the greater the amount of cache “misses” will occur, in which the query is not found, and compiling of the query must therefore be performed. However, if the size of the query cache is too large, then valuable memory storage would be wastefully allocated when a smaller cache size would be adequate, and when a different memory area of the system may be in greater need of the memory storage.
Thus, setting a proper cache size is an important task in a DBMS. In prior systems, setting the size of the query cache was a manual task forced upon a database administrator. Typically, the administrator would set the query cache one time initially and not make frequent updates. For example, there are implementations that set the query cache size based on a program asking the user initially a set of questions and using a heuristic based on the answers to the questions to set the query cache size. A problem with such solutions is that they do not adapt to changes occurring in the database system, and they are based on equations which are only rough approximations of query size and performance of the system. Other solutions are likewise inflexible, requiring that simulations be run at particular pre-designated cache sizes to find a more optimal cache size for the system.
Accordingly, what is needed is the ability to provide accurate estimations as to how much processing time is saved and how much processing time it will cost for tuning a query cache to any desired size, and at any time over the actual operation of a database system, to assist the tuning of the size of that query cache. The present invention addresses such a need.