This invention relates to the fields of computer systems and data processing. More particularly, a system and methods are provided for predicting the performance of a cache, as measured by a suitable metric such as the cache's miss rate.
When a cache is used to speed users' access to shared data, the amount of available memory is often the determining factor in selecting the cache size. System managers or database administrators usually do not have the tools to tune or resize the cache to the demands of a typical workload. Thus, memory may be wasted if the cache is too large, or excessive input/output operations may be incurred if the cache is too small.
To determine whether a cache is allocated sufficient memory, different cache sizes may be tested under actual operating conditions in a lengthy trial-and-error process. Caches of different sizes may be implemented and their performance noted for comparison. Suitable measures of performance may include a cache's hit rate or miss rate, which indicate how frequently the cache is found to include or omit, respectively, a referenced data item. Prolonged or repeated testing, however, may adversely affect operation of the data processing system, as some cache implementations will be too generous and waste memory on the cache, while others will be too stingy and result in an undesirable level of input/output operations.
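As an illustration of the hit rate and miss rate measures described above, the following minimal sketch computes both from raw counts. The function name `hit_and_miss_rates` and the sample counts are hypothetical and chosen only for this example; they do not come from any particular system.

```python
def hit_and_miss_rates(hits: int, misses: int) -> tuple[float, float]:
    """Return (hit_rate, miss_rate) given raw hit and miss counts.

    Hypothetical helper for illustration: each reference to the cache is
    assumed to be counted as exactly one hit or one miss.
    """
    total = hits + misses
    if total == 0:
        raise ValueError("no references recorded")
    return hits / total, misses / total


# Assumed sample counts: 900 references found in the cache, 100 not found.
hit_rate, miss_rate = hit_and_miss_rates(hits=900, misses=100)
print(f"hit rate = {hit_rate:.2f}, miss rate = {miss_rate:.2f}")
```

Because every reference is either a hit or a miss, the two rates sum to one, so either may serve as the performance metric.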
Alternatively, predictions may be generated for a particular cache size, based on the performance of a baseline cache for a given workload. For example, based on a number or ratio of hits or misses in a baseline cache of a given size, a proportionally adjusted number of hits or misses may be expected to result from a particular increase or decrease in size. Such extrapolations may, however, be very inaccurate.
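One possible form of such a proportional extrapolation is sketched below. It assumes, purely for illustration, that the number of misses scales inversely with cache size; the function name `extrapolate_misses` and the sample figures are hypothetical. The sketch ignores the reference pattern of the workload entirely, which is one reason predictions of this kind may be inaccurate.

```python
def extrapolate_misses(baseline_misses: float,
                       baseline_size: int,
                       new_size: int) -> float:
    """Naively scale a baseline miss count to a different cache size.

    Assumption for illustration only: misses are inversely proportional
    to cache size, so doubling the cache is expected to halve the misses.
    """
    if baseline_size <= 0 or new_size <= 0:
        raise ValueError("cache sizes must be positive")
    return baseline_misses * baseline_size / new_size


# Assumed baseline: 1000 misses observed with a 100-buffer cache.
predicted_larger = extrapolate_misses(1000, baseline_size=100, new_size=200)
predicted_smaller = extrapolate_misses(1000, baseline_size=100, new_size=50)
print(predicted_larger, predicted_smaller)
```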
Simulations of caches of different sizes may be performed, although each simulation can typically model the performance of only one cache size. Although simulations sometimes offer more accurate predictions, they can add a great deal of overhead to the operation of a data processing system, which may be an unacceptable trade-off. To predict the operation of multiple caches (e.g., of different sizes), multiple simulations may be required, thereby increasing the impact on an operational environment.
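The cost of simulating one cache size at a time can be seen in the following minimal sketch, which replays a reference trace once for each candidate size. A least-recently-used replacement policy is assumed here only for the sake of the example; the function name `simulate_lru_misses` and the sample trace are likewise hypothetical.

```python
from collections import OrderedDict


def simulate_lru_misses(trace, cache_size):
    """Replay a reference trace against one simulated cache; return misses.

    Assumptions for illustration: LRU replacement and a capacity measured
    in items. Each call models exactly one cache size.
    """
    if cache_size < 1:
        raise ValueError("cache_size must be at least 1")
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for item in trace:
        if item in cache:
            cache.move_to_end(item)        # hit: mark as most recently used
        else:
            misses += 1
            cache[item] = True             # miss: bring the item into the cache
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used item
    return misses


# Assumed workload: four items referenced cyclically, five times over.
trace = ["a", "b", "c", "d"] * 5

# One full pass over the trace is needed for every size of interest.
misses_by_size = {size: simulate_lru_misses(trace, size) for size in (2, 3, 4)}
print(misses_by_size)  # {2: 20, 3: 20, 4: 4}
```

In this sample trace, growing the cache from two to three items removes no misses, while growing it from three to four removes most of them. Behavior of this kind is not captured by a simple proportional estimate, and observing it here required a separate pass over the trace for each size.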
Furthermore, even if a more suitable or optimal cache size is identified through an existing method of prediction, data processing operations (e.g., of a database management system) may need to be halted and restarted in order to implement the new size.