The invention relates generally to retaining objects in cache and, more particularly, to methods and systems for implementing a protocol for replacing cached objects with recently received objects.
With the growth of the World Wide Web, an increasingly large fraction of available bandwidth on the Internet is used to transfer Web documents. Access to the Web documents is generally structured around the HyperText Transfer Protocol (HTTP), which is a request-and-response protocol. When a user at a client device, such as a personal computer, designates a particular Web page, at least one request is generated. The number of requests is dependent upon the sophistication of the designated Web page. Often, a Web page is formed of a number of files, such as text files, graphics files, audio files and video files. Each one of the files is referred to as an "object." A multi-object page is aesthetically pleasing, but each object requires a separate request and a separate response. Therefore, each request-and-response round trip time plays a role in determining the total time a user must wait to view the complete designated Web page.
The total latency in downloading a Web page or other Internet document (e.g., an FTP file) depends upon a number of factors, including the transmission speeds of communication links between a client device and a server on which the file is stored, delays that are incurred at the server in accessing the document, and delays incurred at any device located between the client device and the server. The intermediate devices may include proxies and routers. If there are a number of objects embedded within a Web page, the delays occur for each object.
Web proxies serve as intermediaries between browsers on a client side of an Internet connection and servers on the opposite side. An important benefit of a Web proxy is the ability of the proxy to cache objects. The caching operations of the Web proxy will be described with reference to FIG. 1. When a client device 12 generates a request 14 for a particular object, the cache of a proxy 16 is searched to determine whether the object is stored at the proxy. If the object is not found in the cache, the request is directed to a server 18 via the Internet 20. In FIG. 1, the requested object 10 is indicated in phantom. As an example, the object 10 may be an HTML file. A response 22 from the server is directed through the proxy 16 to the client device 12. Preferably, the object 10 that is contained within the response 22 is cached at the proxy 16. At a later time, either the same client device or a different client device 24 may generate a request 26 for the same object. The object 10 is in the cache of the proxy, allowing the object to be forwarded to the client device 24 directly from the proxy, as shown by response 28. This eliminates delays encountered in communicating between the proxy 16 and the server 18.
The first request 14 resulted in a "cache miss," since the requested object 10 was not retained in the cache of the proxy 16. On the other hand, the second request 26 resulted in a "cache hit." By storing copies of objects, the proxy 16 can reduce the number of requests that are directed to servers 18, as well as the volume of traffic on Internet backbones as a result of transmitting the responses in the form of a number of packets that must be reassembled at the client device 12.
Ideally, the cache at the proxy 16 can retain all of the objects that are transferred through the proxy. However, the typical storage capacity for the proxy is in the range of 256 megabytes to 1 terabyte, with most Web proxy capacity being at the lower half of the range. Therefore, it is important to form a replacement strategy for determining which objects are evicted from cache when a recently received object is to be cached within exhausted storage space. Two important metrics that are used to measure proxy cache performance are cache hit rate and byte hit rate. The cache hit rate is the percentage of all user requests 14 and 26 that are satisfied by the proxy 16, rather than by access to the original server 18. Byte hit rate is the percentage of all network traffic, measured in bytes, transferred directly from the proxy cache, instead of across the external network.
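The two metrics can diverge sharply, which a short calculation makes concrete. The following is a minimal sketch, assuming a hypothetical request log of (size in bytes, served-from-cache flag) pairs; the function name and the log values are illustrative only.

```python
def hit_rates(requests):
    """Return (cache hit rate, byte hit rate) as percentages.

    requests: iterable of (size_in_bytes, served_from_cache) pairs.
    """
    total_requests = hit_requests = 0
    total_bytes = hit_bytes = 0
    for size, was_hit in requests:
        total_requests += 1
        total_bytes += size
        if was_hit:
            hit_requests += 1
            hit_bytes += size
    if total_requests == 0 or total_bytes == 0:
        return 0.0, 0.0
    return (100.0 * hit_requests / total_requests,
            100.0 * hit_bytes / total_bytes)


# Hypothetical log: three small objects served from cache, one large miss.
log = [(1000, True), (1000, True), (1000, True), (97000, False)]
cache_hit_rate, byte_hit_rate = hit_rates(log)
print(cache_hit_rate, byte_hit_rate)
```

In this illustrative log, most requests are hits, yet most of the bytes still cross the external network, which is why a replacement strategy is usually judged on both metrics.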
There are a number of replacement strategies that have been proposed by the scientific community with regard to Web proxy caching. Some of the strategies are relatively simple and easy to implement, while others rely heavily upon setting parameters and are difficult to implement. A well-organized survey of currently known Web replacement strategies is provided by Pei Cao and Sandy Irani in an article entitled, "Cost-Aware WWW Proxy Caching Algorithms," Proceedings of USENIX Symposium on Internet Technologies and Systems, Monterey, Calif., pages 193-206, December, 1997. The article describes ten previously known replacement algorithms.
According to the least-recently-used (LRU) algorithm, when an eviction is required in order to store a recently received object, the previously cached object that was requested least recently is evicted. This is a traditional strategy and operates well for CPU caches and virtual memory systems. However, it does not work as well for proxy caching, since the timing of accesses in Web traffic often follows very different patterns. For example, some Web pages may be popular only during certain times of the day or certain days of the month.
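The LRU policy described above can be sketched in a few lines. This is an illustrative sketch only: the class name, the `fetch` callback standing in for the origin server, and the capacity measured in number of objects (assumed to be at least one) are all assumptions, not part of the described systems.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity    # maximum number of objects (assumed >= 1)
        self.store = OrderedDict()  # least recently requested object first

    def request(self, key, fetch):
        """Return (object, was_hit)."""
        if key in self.store:
            self.store.move_to_end(key)     # mark as most recently requested
            return self.store[key], True
        value = fetch(key)                  # cache miss: go to the server
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict least recently requested
        self.store[key] = value
        return value, False


cache = LRUCache(capacity=2)
fetch = lambda key: "body of " + key        # hypothetical origin server
results = [cache.request(k, fetch)[1] for k in ["a", "b", "a", "c", "b"]]
print(results, list(cache.store))
```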
A second known strategy is the least-frequently-used (LFU) algorithm that replaces the object which has been accessed the least number of times. This strategy attempts to keep more popular objects and replace rarely used objects. However, some objects can build a high frequency count over a short period of time and be rarely accessed after the subject matter is no longer "hot." Such objects often remain within cache long after retaining them has ceased to benefit network performance. The traditional LFU strategy does not provide any mechanism to remove such documents, leading to "cache pollution." Typical examples are objects of a Web site dedicated to a one-time, high-profile event.
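Cache pollution under plain LFU can be reproduced with a small sketch. The class below is hypothetical and tracks only request counts (object bodies and sizes are omitted); the object name "event" stands for a formerly popular object.

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of objects (assumed >= 1)
        self.count = {}           # key -> number of requests while cached

    def request(self, key):
        """Return True on a cache hit, False on a miss."""
        if key in self.count:
            self.count[key] += 1
            return True
        if len(self.count) >= self.capacity:
            victim = min(self.count, key=self.count.get)  # lowest count
            del self.count[victim]
        self.count[key] = 1
        return False


cache = LFUCache(capacity=2)
for _ in range(100):              # a short burst of popularity
    cache.request("event")
# Afterwards only "a" and "b" are requested, yet "event" is never evicted.
later_hits = sum(cache.request(k) for k in ["a", "b", "a", "b", "a", "b"])
print(later_hits, cache.count)
```

In this run the two currently requested objects keep evicting each other while the stale object holds its slot, so none of the later requests is a hit.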
A third strategy is to evict the largest document stored in cache. This size strategy attempts to maximize the cache hit rate by evicting one large object, rather than a number of small objects. However, some of the small objects may never be accessed again. This third strategy does not provide any mechanism to evict such documents, leading to cache pollution.
A fourth strategy identified in the Cao and Irani article is referred to as an LRU-threshold strategy. This strategy is equivalent to the LRU policy, but it does not cache documents larger than a certain threshold size.
Another refinement of the LRU strategy is the log(size)+LRU strategy, which replaces the document that has the largest log(size) and is the least recently used among the documents having the same log(size). A hyper-G strategy is a refinement of the LFU strategy with last access time and size considerations. Yet another strategy is referred to as the Pitkow/Recker strategy that replaces the least recently used document, unless all of the documents were accessed on that particular day. In this case, the largest document is replaced. This strategy attempts to monitor the daily time access patterns specific to the Web documents. This replacement strategy has been proposed as one to run at the end of a day, in order to free space occupied by "old" least-recently accessed documents.
An eighth strategy is the lowest-latency-first policy that removes the document with the lowest download latency. The strategy is directed to minimizing average latency.
A ninth identified strategy is a hybrid policy that also targets reducing the average latency. For each object, a utility value of retaining the object in cache is computed. Objects with the smallest utility value are replaced. The utility value is designed to capture the utility of retaining a given object in the cache. The value is based upon a number of factors, including the time to connect with the server, the bandwidth of the server, the number of times that the object has been requested since it was brought into the cache, and the size of the object.
The last strategy identified in the Cao and Irani article is the lowest relative value (LRV) strategy that includes the cost and size of an object in the calculation of a value that estimates the utility of keeping a document in cache. The replacement algorithm evicts the object with the lowest value.
A refinement of the LFU strategy not identified in the Cao and Irani article has been proposed for caching, but not specifically proxy caching. The strategy places two limitations on the counts of object requests in the caching LFU algorithm: AMaX, which places an upper limit on the average request count for all objects in the cache; and MRefs, which imposes an upper limit on the request count that can be assigned to a single object. Whenever the average request count for objects in the cache surpasses AMaX, the request count of each object in the cache is reduced by a factor of two. While this refinement may improve proxy performance, further improvements are desired. Implementation of the strategy requires a recurring "walk through" of the entire cache storage space in order to adjust the request count of each cached object. Thus, the processing requirements can be significant.
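The aging step of this refinement, and the reason it is costly, can be sketched as follows. The function and parameter names are hypothetical, and the integer halving with a floor of one is an assumption; the text above specifies only that counts are reduced by a factor of two.

```python
def record_hit(counts, key, a_max, m_refs):
    """Count one request for `key`, then age all counts if needed.

    counts: dict mapping each cached object to its request count.
    a_max:  upper limit on the average request count (AMaX).
    m_refs: upper limit on any single object's count (MRefs).
    """
    counts[key] = min(counts[key] + 1, m_refs)
    if sum(counts.values()) / len(counts) > a_max:
        # The "walk through": every cached object is visited and halved.
        for k in counts:
            counts[k] = max(1, counts[k] // 2)  # assumed floor of 1


counts = {"a": 9, "b": 1}
record_hit(counts, "a", a_max=5, m_refs=100)
print(counts)
```

The loop over every entry is the source of the processing cost noted above, since it touches the whole cache each time the average crosses the limit.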
Cao and Irani propose a replacement strategy for web proxies which incorporates size of the cached objects into a previously known Greedy-Dual algorithm. The original Greedy-Dual algorithm dealt with the case in which pages in a cache (memory) had the same size, but had different costs to fetch them from the storage. The original algorithm associated a value H with each cached page p. Initially, when a page was brought to the cache, H was defined to be the cost of bringing the page into the cache (i.e., H=Cost). When a replacement was needed, the object with the lowest H value, minH, was replaced and all of the H values of the remaining objects were reduced by minH. If an object was accessed again, its current value H was restored to the original cost of obtaining the object to the cache. Thus, the H values of recently accessed pages maintained a large amount of the original cost, compared to objects that had not been accessed for a significant period of time.
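The original Greedy-Dual bookkeeping described above can be sketched as follows. The class name, the capacity measured in pages (assumed to be at least one), and the example costs are illustrative assumptions; only the H-value rules come from the description.

```python
class GreedyDual:
    def __init__(self, capacity):
        self.capacity = capacity  # number of equal-size pages (assumed >= 1)
        self.h = {}               # page -> current H value

    def request(self, page, cost):
        """Return True on a hit, False on a miss."""
        if page in self.h:
            self.h[page] = cost   # restore H to the original cost
            return True
        if len(self.h) >= self.capacity:
            victim = min(self.h, key=self.h.get)
            min_h = self.h.pop(victim)  # evict the page with the lowest H
            for p in self.h:
                self.h[p] -= min_h      # reduce all remaining H by minH
        self.h[page] = cost             # initially, H = Cost
        return False


gd = GreedyDual(capacity=2)
gd.request("a", cost=10)
gd.request("b", cost=4)
gd.request("c", cost=6)            # evicts "b"; H("a") drops from 10 to 6
after_eviction = dict(gd.h)
gd.request("a", cost=10)           # hit: H("a") is restored to 10
print(after_eviction, gd.h)
```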
In comparison, the Greedy-Dual-Size algorithm introduced the size of the object into the determination of the H value, so that H=Cost/Size, where Size is measured in bytes. To provide a high cache hit ratio, the Cost function of each object may be set to 1. In such a way, larger objects have a smaller H value than small objects, and are therefore more likely to be replaced if they are not referenced again in the near future. Setting the Cost to 1 favors small documents over large documents, especially those that are rarely referenced. Because there are a larger number of objects within the cache, the cache hit ratio will remain relatively high. However, the byte hit ratio may be sacrificed. A relatively high cache hit ratio and high byte hit ratio may be obtained by setting the Cost function for each object to 2+Size/536, which is the estimated number of network packets sent and received to satisfy a cache miss for the requested object. This setting of the Cost function assigns a greater cost to larger objects than to smaller ones, so that H=2/Size+1/536 and the H value of a large object falls much less steeply with size than under a Cost of 1. Relative to the unit-cost setting, it allows the objects having a small size to be replaced more readily, especially if the larger objects are often referenced. If a large object is seldom referenced, it will be replaced as a result of the steady decrease in the H value.
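The effect of the two Cost settings on H=Cost/Size can be checked numerically. The sketch below is illustrative: the function names and the two object sizes are assumptions chosen so that the arithmetic is easy to follow.

```python
def h_value(size_bytes, cost):
    """Greedy-Dual-Size value H = Cost / Size."""
    return cost / size_bytes


def packet_cost(size_bytes):
    """Estimated packets to satisfy a miss: 2 + Size / 536."""
    return 2 + size_bytes / 536


small, large = 536, 53600  # hypothetical sizes, a factor of 100 apart

# Cost = 1: H depends only on size.
ratio_unit = h_value(small, 1) / h_value(large, 1)

# Cost = 2 + Size/536: H = 2/Size + 1/536.
h_small = h_value(small, packet_cost(small))
h_large = h_value(large, packet_cost(large))
ratio_packets = h_small / h_large

print(ratio_unit, ratio_packets)
```

For these sizes the ratio between the small and the large object's H value is about 100 under a Cost of 1 and about 3 under the packet-based Cost, so the packet-based setting penalizes large objects far less.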
The Greedy-Dual-Size algorithm outperforms previously known caching strategies with regard to a number of performance metrics, including cache hit ratio and byte hit ratio. However, the Greedy-Dual-Size algorithm does have at least one shortcoming, namely, that the algorithm does not take into account how many times the cached object was accessed in the past. An example is provided in how the Greedy-Dual-Size algorithm handles a hit and a miss for two different documents of the same size. When the documents are initially brought into the cache, both documents receive the same value of H=Cost/Size. The first document (doc 1), which was accessed a number of times in the past, will have the same value as the second document (doc 2) that is accessed from cache for the first time. Thus, in a worst case scenario, the frequently accessed doc 1 will be replaced, instead of the once-used doc 2, when it is time to replace one of the documents with an object having a higher H value.
What is needed is a method for systematically caching objects and replacing cached objects such that currently popular requested objects are likely to be stored in local cache but previously popular objects are readily replaced, thereby providing high cache hit and byte hit ratios. Ideally, such a method should have application not only to proxy cache management, but also to other types of cache management as well. The present invention solves these needs and provides further related advantages.
The present invention provides a method and a system for caching objects and replacing cached objects in an object transfer environment using a dynamic indicator for each object; the dynamic indicator is dependent upon frequency of requests for the object as well as upon time of storing the cached object relative to other cached objects. In one embodiment, frequency of requests is a factor that allows the dynamic indicator to exceed its original value, with relative time of storage providing a balance that reduces the possibility of cache pollution. In another embodiment, the size of the object is a factor in determining the dynamic indicator, and still further, in another embodiment, cost of obtaining the object (e.g., use of network resources) is also used as a factor.
The method and the system may be used to establish a replacement strategy for caching objects in local cache of a proxy connected to the Internet. The objects are received from servers via the Internet, and are directed to target devices, such as personal computers. The method and system may also be used in other cache management environments, such as for CPU cache management (e.g., "data compaction" for removing objects from localized cache). In at least some of these applications, objects can be selected for eviction from cache even when the capacity of cache has not been reached, such as during times of processor inactivity.
In one embodiment, the dynamic indicator (Pr) of an object (f) is a function of the current relative storage time (Clock) of the object, the size of the object (Size(f)), the frequency of requests (Fr(f)) for the object, and the cost of the object (Cost(f)). In this embodiment, the dynamic indicator may be determined such that

$$\mathrm{Pr}(f) = \mathrm{Clock} + \mathrm{Fr}(f) \times \frac{\mathrm{Cost}(f)}{\mathrm{Size}(f)}. \qquad (\text{EQ. 1})$$
The indicator Pr(f) is "dynamic" with respect to the parameter Fr(f), since the value Fr(f) is adjusted each time that the object (f) is requested. The parameter Clock is a dynamic value that preferably is increased each time an object is evicted from local cache, and that is fixed for each object during storage in local cache. That is to say, later-cached objects will have a higher value of Clock (and therefore Pr(f)) than earlier cached objects. The clock function, therefore, is indicative of the "strength" of previously evicted objects. As a result of this mechanism, objects that have not been accessed for long periods of time are susceptible to eviction even if the frequency counts of the objects are high. Optionally, the determination of the dynamic indicator can be modified by including a coefficient (Alpha) as a multiplier of the file frequency (Fr(f)).
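The balance between frequency and Clock can be illustrated with a direct evaluation of the dynamic indicator. The sketch below is illustrative only: the function name, the default Alpha of 1.0, and the sample values for Clock, Cost, and Size are assumptions.

```python
def priority(clock, freq, cost, size, alpha=1.0):
    """Pr(f) = Clock + Alpha * Fr(f) * Cost(f) / Size(f)."""
    return clock + alpha * freq * cost / size


# Two objects of equal size and cost, cached while Clock was still 0.
pr_frequent = priority(clock=0.0, freq=5, cost=1.0, size=1000)
pr_once = priority(clock=0.0, freq=1, cost=1.0, size=1000)

# A third object cached later, after evictions have raised Clock.
pr_recent = priority(clock=0.01, freq=1, cost=1.0, size=1000)

print(pr_frequent, pr_once, pr_recent)
```

With these sample values the frequently requested object outranks the once-requested object cached at the same time, while the later-cached object outranks both, which is how a high but stale frequency count eventually loses its protection.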
If desired, the dynamic indicator and its determination can be simplified by eliminating one or both of the Cost and Size functions. If both of the functions are removed from the determination of Pr(f), the Clock and frequency functions will remain to provide a balance that favors current popularity over past popularity. For example, if the Size function is removed, but the Cost function is utilized, the algorithm (i.e., Pr(f)=Clock+Fr(f)×Cost(f)) operates well to provide a solution for conventional storage and cache paging concerns, in addition to use within a network environment.
In the practice of the method, if a request for the object (f) is received at a proxy from a client device, the local cache is searched to determine if the object is stored in local cache. For a cache hit, the object is served to the requesting client device. After the value of Fr(f) is adjusted, the dynamic indicator for the object (f) is re-computed. On the other hand, if the request results in a cache miss, the object is accessed from the remote server on which the object is stored. The object is served to the client device and a calculation of the dynamic indicator is initiated using a frequency indication of "one," i.e., Fr(f)=1. A determination may then be made as to whether a replacement of a cached object is dictated by the caching strategy. This determination may involve a number of factors (e.g., factors that act to prevent the object from replacing a cached object that is likely to subsequently have a greater frequency count), or may merely involve a calculation of whether the object can be cached without replacing a previously cached object. When a replacement is dictated, an identification may be made of which file or files are to be evicted.
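The hit and miss handling described above can be sketched as a small simulation. This is a sketch under stated assumptions, not the claimed implementation: the class and field names are hypothetical, capacity is measured in bytes, Clock is assumed to be set to the dynamic indicator of each evicted object, the re-computation on a hit is assumed to use the current Clock, and the replacement decision is reduced to whether the object fits.

```python
class FrequencyAwareCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.clock = 0.0
        self.entries = {}  # key -> {"size", "cost", "freq", "pr"}

    def _pr(self, freq, cost, size):
        return self.clock + freq * cost / size

    def request(self, key, size, cost=1.0):
        """Return True on a cache hit, False on a miss."""
        entry = self.entries.get(key)
        if entry is not None:
            entry["freq"] += 1  # adjust Fr(f), then re-compute Pr(f)
            entry["pr"] = self._pr(entry["freq"], entry["cost"], entry["size"])
            return True
        # Cache miss: the object would be fetched from the server here.
        if size > self.capacity:
            return False        # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda k: self.entries[k]["pr"])
            evicted = self.entries.pop(victim)
            self.used -= evicted["size"]
            self.clock = evicted["pr"]  # assumption: Clock rises to evicted Pr
        self.entries[key] = {"size": size, "cost": cost, "freq": 1,
                             "pr": self._pr(1, cost, size)}  # Fr(f) = 1
        self.used += size
        return False


cache = FrequencyAwareCache(capacity_bytes=300)
trace = ["a", "a", "a", "b", "c", "d"]  # hypothetical request sequence
hits = [cache.request(k, size=100) for k in trace]
print(hits, sorted(cache.entries), cache.clock)
```

In this trace the thrice-requested object survives the eviction triggered by the last request, and the newly cached object starts from the raised Clock rather than from zero.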
To prevent Clock and the frequency count Fr(f) from entering an overflow situation, an offset procedure may be implemented; when Clock has reached a predefined upper limit, an offset may be computed and the next assigned Clock value is reduced by the offset. Simultaneously, the dynamic indicators of all of the cached objects may be reduced by the offset.
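The offset procedure can be sketched as a rebasing step. The description does not state how the offset is computed, so the choice below (the smaller of Clock and the lowest cached dynamic indicator, which keeps every value non-negative and preserves the eviction order) is an assumption, as are the function and parameter names.

```python
def rebase(clock, priorities, clock_limit):
    """Subtract a common offset once Clock reaches clock_limit.

    priorities: dict mapping each cached object to its dynamic indicator.
    Returns the new Clock and a new dict of reduced indicators.
    """
    if clock < clock_limit:
        return clock, dict(priorities)
    lowest = min(priorities.values()) if priorities else clock
    offset = min(clock, lowest)  # assumed way of choosing the offset
    reduced = {k: pr - offset for k, pr in priorities.items()}
    return clock - offset, reduced


new_clock, new_prs = rebase(1000.0, {"a": 1000.5, "b": 1002.0},
                            clock_limit=1000.0)
print(new_clock, new_prs)
```

Because the same offset is subtracted everywhere, the relative order of the cached objects, and hence the choice of eviction victim, is unchanged.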
To prevent the object frequency count from overflowing and to guarantee the algorithm properties and performance, the count preferably stops at an upper limit (e.g., 10^8). Such a high value of the object's frequency count will generally guarantee that the dynamic indicator for the cached object will inhibit replacement of the object. On the other hand, if the object frequency count has reached the upper limit and it has not been accessed for a long time, eventually Clock will exceed the upper limit, and all the newly accessed objects will have higher Pr(f) values, allowing the replacement of the cached object. Such an approach prevents the cache from becoming polluted with "hot" files which become "out of interest" and rarely or never are accessed again.
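The interaction between a saturating frequency count and a rising Clock can be sketched as follows. The sketch is illustrative only: a very small upper limit is used so that the loop stays short, Cost and Size are both taken as 1, and Clock is assumed to rise by a fixed step per eviction.

```python
def bump(freq, limit):
    """Increment a frequency count, stopping at the upper limit."""
    return min(freq + 1, limit)


def priority(clock, freq, cost=1.0, size=1.0):
    return clock + freq * cost / size


limit = 5  # hypothetical small limit for illustration
freq = 1
for _ in range(20):  # many requests: the count saturates
    freq = bump(freq, limit)
stale_pr = priority(clock=0.0, freq=freq)

# The object is then never requested again while evictions raise Clock.
clock = 0.0
evictions = 0
while priority(clock, freq=1) <= stale_pr:
    clock += 1.0     # assumed fixed Clock increase per eviction
    evictions += 1

print(freq, stale_pr, evictions)
```

Once Clock has risen far enough, even a first-time object outranks the saturated one, so the formerly hot object becomes the eviction candidate.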