1. Technical Field
The present invention relates to a method and means for managing a computer system cache and, in particular, to cache replacement algorithms.
2. Description of the Related Art
Caches are typically high-speed memories in modern computer systems which temporarily hold portions of the contents of larger, slower storage media. Integral to the management of cache systems is the replacement algorithm, which selects the portion of the cache contents to be evicted when space is needed for new contents. Improvements in the management of cache contents, in particular in selecting those portions of data to be evicted, can result in significant improvements in overall system performance.
Cache memories are commonly utilized to provide information to processors at a faster speed than that which is available from main memory. With the advent of distributed processing, cache systems are also used to improve the speed of data access to servers in a networked data communications system. An important example is caching by web-proxy. Web-proxies are positioned between the web browser and an originating server. Instead of fetching an object, for example a document, directly from the server, the browser first contacts the proxy. If the proxy has a valid copy of the object in its cache, it sends this copy to the browser. Otherwise, the proxy contacts the originating server directly to get a copy of the requested object. Then, the proxy sends this copy to the browser and, if the document is cacheable, stores the document in its cache. If an object is fetched from the cache, this is called a “hit”; otherwise it is called a “miss”, requiring a “fetch” to load that object from the server into the proxy's cache.
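The hit and miss flow described above can be illustrated with a minimal sketch. The class name `ProxyCache` and the callable `fetch_from_origin` are hypothetical names chosen for illustration only, and the sketch assumes every cached copy is still valid (no expiry checking) and that the cache never fills up.

```python
class ProxyCache:
    """Toy web-proxy cache: serve from cache on a hit, fetch on a miss."""

    def __init__(self, fetch_from_origin):
        # Hypothetical callable: url -> (body, cacheable)
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            # Hit: the proxy answers from its own cache.
            self.hits += 1
            return self._store[url]
        # Miss: contact the originating server for a copy.
        self.misses += 1
        body, cacheable = self._fetch(url)
        if cacheable:
            self._store[url] = body
        return body

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The `hit_rate` value computed here corresponds to the quantity that a replacement method seeks to maximize.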
A cache is typically full during normal operation, and a cache miss requires not only a fetch but also a replacement, where one or more other objects must be removed or evicted from the cache and the fetched object is loaded into the cache. Prior art replacement methods include usage-based methods, which take into account the history of every object's use when determining which objects to evict. Examples of this type of replacement method are the “least recently used” (LRU) approach and the “working set” approach. Non-usage-based approaches select objects for eviction on some basis other than usage. The “first-in-first-out” (FIFO) approach and the “random” approach are examples of this class.
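The difference between a usage-based method (LRU) and a non-usage-based method (FIFO) can be seen in a small sketch. The class `SmallCache` and its `policy` argument are illustrative names, not part of any prior art system, and the capacity is counted in objects rather than bytes for simplicity.

```python
from collections import OrderedDict


class SmallCache:
    """Fixed-capacity cache that evicts by LRU or FIFO order."""

    def __init__(self, capacity, policy="lru"):
        self.capacity = capacity
        self.policy = policy          # "lru" or "fifo"
        self._items = OrderedDict()   # oldest entry first

    def access(self, key):
        """Return True on a hit, False on a miss (the key is then loaded)."""
        if key in self._items:
            if self.policy == "lru":
                # Usage-based: a hit makes the object most recently used.
                self._items.move_to_end(key)
            # FIFO ignores usage, so insertion order is left unchanged.
            return True
        if len(self._items) >= self.capacity:
            # Evict the entry at the front of the ordering.
            self._items.popitem(last=False)
        self._items[key] = True
        return False

    def keys(self):
        return list(self._items)
```

For the access sequence a, b, a, c with a capacity of two objects, the LRU policy evicts b (a was used more recently), whereas the FIFO policy evicts a (it was loaded first).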
Similar methods are utilized for cache systems used in a web-proxy. When the cache is full, the proxy has to remove old objects, also called documents, in order to free space for newer documents. Within this art, cache replacement methods are also called garbage collection. Web-proxies typically store a vast number of documents, and it is not feasible to maintain a data structure that keeps all of the documents sorted according to the probability of a future access. For this reason, garbage collection may be performed either at a certain time of the day or whenever the cache size exceeds a given limit. For determining which documents are to be evicted, a weight is assigned to each document in the cache, and documents are selected for eviction based on their weight in relation to the weights assigned to the other documents. The weight is an estimate of the relative significance of a particular object. In other words, the weight estimates the relative probability and number of instances that a particular document will be accessed in the future, as compared to other documents stored in the cache.
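Weight-based garbage collection triggered by a size limit may be sketched as follows. The function name `collect_garbage`, the `weight_of` callable, and the byte-based size limit are assumptions made for illustration; the sketch sorts the documents only at collection time rather than keeping them permanently sorted.

```python
def collect_garbage(documents, weight_of, size_limit):
    """Evict the lowest-weight documents until the cache fits the limit.

    documents  -- dict mapping document name to size in bytes (modified in place)
    weight_of  -- hypothetical callable returning the weight of a document name
    size_limit -- maximum total size of the cache in bytes
    Returns the list of evicted document names, in eviction order.
    """
    total = sum(documents.values())
    evicted = []
    if total <= size_limit:
        return evicted  # the trigger condition is not met
    # Sort once, at collection time, from least to most significant.
    for name in sorted(documents, key=weight_of):
        if total <= size_limit:
            break
        total -= documents.pop(name)
        evicted.append(name)
    return evicted
```

Because the sort happens only when collection runs, the cost of ordering a large number of documents is paid occasionally rather than on every access.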
The fundamental goal of garbage collection is to keep documents which have a high probability of being accessed often in the future and to remove documents that are least likely to be accessed. The accuracy of the assigned weights, as a relative measure of the probability that an object or document will be accessed often in the future, is essential to effective garbage collection. Small improvements in a method for determining such weights can result in significant improvements in the overall hit-rate of the cache memory, and significant improvements in the overall performance of the networked data communications system.
It is a purpose of the method and system of the present invention to provide a self-adapting algorithm for selecting objects to be evicted from a cache where the algorithm is adjusted based on actual performance.
It is another purpose of the invention for the algorithm to be automatically and continuously adjusted to adapt to varying cache usage patterns.
The foregoing purposes are achieved using the following method. A cache is provided that is adapted to store objects which can be selectively evicted to make space for new objects. An algorithm is provided for determining a weight to be associated with each object stored in the cache. The weights are determined based on a first attribute of the objects and an associated first control parameter, which determines the significance of the first attribute to the overall object weights. The weights are then used to select the objects to be evicted from the cache. The first control parameter is set to an initial value. The hit rate is observed during a first time interval. The first control parameter is then adjusted in a first direction by a first incremental amount. The hit rate is then observed during a second time interval. The first control parameter is then adjusted by a second incremental amount, based on the hit rate during the first time interval and the second time interval.
Optionally, when the hit rate is improved during the second time interval, the second incremental adjustment is made in the same direction as the first incremental adjustment. Otherwise, when the hit rate is not improved during the second time interval, the second incremental adjustment is made in the direction opposite to that of the first incremental adjustment.
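The adjustment rule just described resembles a simple hill-climbing loop, which may be sketched as follows. The names `next_adjustment`, `tune`, and `observe_hit_rate` are hypothetical; `observe_hit_rate` stands in for running the cache with a given parameter value for one time interval and returning the measured hit rate. The sketch keeps the magnitude of the adjustment fixed and varies only its direction.

```python
def next_adjustment(prev_step, hit_rate_first, hit_rate_second):
    """Keep the direction if the hit rate improved, otherwise reverse it."""
    if hit_rate_second > hit_rate_first:
        return prev_step
    return -prev_step


def tune(observe_hit_rate, initial, step, rounds):
    """Repeatedly adjust one control parameter based on observed hit rates."""
    param = initial
    prev_rate = observe_hit_rate(param)   # first time interval
    param += step                         # first incremental adjustment
    for _ in range(rounds):
        rate = observe_hit_rate(param)    # next time interval
        step = next_adjustment(step, prev_rate, rate)
        param += step
        prev_rate = rate
    return param
```

With a fixed step size, such a loop does not settle on a single value; it oscillates within about one step of the parameter value that gives the best hit rate, which also lets it follow that value if usage patterns drift.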
In accordance with another embodiment of the present invention, when the hit rate is improved during the second interval, the magnitude of the second incremental adjustment is chosen to be larger than when the hit rate is reduced in the second interval.
Yet another embodiment provides that when the hit rate is improved during the second interval, the magnitude of the second incremental adjustment is larger than the magnitude of the first incremental adjustment. Additionally, when the hit rate is reduced in the second interval, the magnitude of the second incremental adjustment is less than the magnitude of the first incremental adjustment.
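A variable-magnitude adjustment of this kind may be sketched as follows. The function name `scaled_adjustment` and the factors `grow` and `shrink` are assumptions chosen for illustration; no particular factors are specified here. The sketch also assumes that a reduced hit rate reverses the direction of the adjustment, and it treats an unchanged hit rate the same as a reduced one.

```python
def scaled_adjustment(prev_step, hit_rate_first, hit_rate_second,
                      grow=2.0, shrink=0.5):
    """Return the next signed adjustment given the previous one.

    grow and shrink are illustrative factors (grow > 1, 0 < shrink < 1).
    """
    if hit_rate_second > hit_rate_first:
        # Improvement: continue in the same direction with a larger step.
        return prev_step * grow
    # No improvement (assumed to include an unchanged hit rate):
    # reverse direction and take a smaller step.
    return -prev_step * shrink
```

Growing the step after a success lets the parameter move quickly while improvements continue, and shrinking it after a failure narrows the search around the best value found.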
The method of the present invention may also include an algorithm for determining weights that is based on a plurality of different attributes of the cache objects, where each attribute has a corresponding control parameter which determines the relative significance of that attribute in determining object weights. The value of each control parameter is determined by repeatedly adjusting one parameter at a time, observing the effects of those adjustments on the actual hit rate, and then setting the parameter accordingly to maximize the hit rate. Optionally, more than one parameter may be adjusted at the same time.
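Adjusting several control parameters one at a time may be sketched as a round-robin search. The function `tune_round_robin` and the callable `observe_hit_rate` are hypothetical names; `observe_hit_rate` stands in for operating the cache for one time interval with the given parameter values. As a simplifying assumption, the sketch compares each observation against the best hit rate seen so far and undoes an adjustment that does not improve on it.

```python
def tune_round_robin(observe_hit_rate, params, steps, sweeps):
    """Adjust one parameter at a time, keeping only improving changes.

    params -- dict mapping parameter name to its initial value
    steps  -- dict mapping parameter name to its initial signed step
    sweeps -- number of passes over all parameters
    """
    params = dict(params)   # work on copies; the inputs are not modified
    steps = dict(steps)
    best = observe_hit_rate(params)
    for _ in range(sweeps):
        for name in params:
            params[name] += steps[name]
            rate = observe_hit_rate(params)
            if rate > best:
                best = rate                  # keep the adjustment
            else:
                params[name] -= steps[name]  # undo the adjustment
                steps[name] = -steps[name]   # try the other direction next
    return params
```

In a live cache the observed hit rate also changes with the workload, so comparing against a historical best is a simplification; comparing consecutive intervals is closer to the method as described.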
In one embodiment, the object attributes considered by the weight algorithm may be the size of each object and the time since the object was last used.
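One possible form of such a weight function is sketched below. The formula, the function name `weight`, and the parameter names `size_exponent` and `age_exponent` are assumptions made purely for illustration and are not stated in this description; the two exponents play the role of control parameters that set how strongly each attribute influences the weight.

```python
def weight(size_bytes, seconds_since_last_use, size_exponent, age_exponent):
    """Illustrative weight: larger and longer-unused objects weigh less.

    Assumed formula: 1 / (size ** size_exponent * age ** age_exponent).
    Both attributes are clamped to at least 1 to avoid division by zero.
    """
    size = max(size_bytes, 1)
    age = max(seconds_since_last_use, 1)
    return 1.0 / ((size ** size_exponent) * (age ** age_exponent))
```

Under this assumed formula, setting `size_exponent` to zero makes the ordering depend only on the time since last use, as in LRU, while setting `age_exponent` to zero makes it depend only on object size.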
The method of the present invention for adjusting control parameters and observing the hit rate includes repeating the control parameter adjustment and hit rate observation steps continuously, using time intervals ranging from several hours to several days.
A preferred embodiment of a system utilizing the methods of the present invention is a web-proxy. Other embodiments of the present invention include computer systems and data processing systems, as well as a computer-readable medium having a computer program for causing a computer to perform any one of the described methods of the present invention.
The purposes, features, and advantages of the present invention will become apparent in the following detailed written description.