The present invention relates to a cache control technique, and more particularly to a technique for selecting data to purge so as to improve cache hit rates.
Cache techniques have been increasingly utilized in many settings as the hierarchization of memories progresses. Moreover, with the recent increase in information processing power, cache misses have become one of the causes of performance deterioration. Thus, cache techniques have a great impact on the performance of network systems.
As a prominent example, there is a proxy cache apparatus which is installed in a network such as the Internet and caches data transferred via the network. A cache hit at the proxy cache apparatus shortens the data transfer route and accordingly improves data transfer speed. Thus, the proxy cache apparatus reduces the response time for transferring data.
FIG. 1 is a block diagram showing an example of a network system employing the proxy cache apparatus. As can be seen in FIG. 1, a proxy cache apparatus 4 acts as an intermediary between one or more servers 1-1 to 1-m and one or more clients 2-1 to 2-n via a network (Internet) 3. The proxy cache apparatus 4 receives a request from a client 2-j (1 ≤ j ≤ n) by proxy for a server 1-i (1 ≤ i ≤ m), and sends the request to the server 1-i on behalf of the client 2-j. Having received data from the server 1-i, the proxy cache apparatus 4 transfers the data to the client 2-j. On this occasion, the proxy cache apparatus 4 caches the data. Consequently, when the proxy cache apparatus 4 next receives a request for the same data from a client 2-k (1 ≤ k ≤ n), the data stored in the proxy cache apparatus 4 is sent to the client 2-k.
A number of caching policies, including LRU, have been proposed for cache apparatuses like the proxy cache apparatus 4. One of those policies is described in detail in the technical report: Ludmila Cherkasova, "Improving WWW Proxies Performance with Greedy-Dual-Size-Frequency Caching Policy," Computer Systems Laboratory HPL-98-69 (R.1), November 1998 (hereinafter referred to as reference 1).
Although the theoretically optimum caching policy is to give the lowest priority to data that will be accessed in the most distant future, it cannot be implemented unless all future data accesses are known. Therefore, caching algorithms such as LRU (Least Recently Used) are merely approximations of the theoretically optimum algorithm.
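The theoretically optimum choice described above can be illustrated with a minimal sketch, assuming the full sequence of future requests is available (which, as noted, it is not in practice). The function name and the list-based representation are hypothetical and chosen only for illustration.

```python
def farthest_future_victim(cached, future_requests):
    """Return the cached item whose next access lies farthest in the future.

    cached: list of item keys currently in the cache.
    future_requests: list of item keys in the order they will be requested.
    Items that are never requested again are treated as infinitely distant.
    """
    def next_use(item):
        try:
            return future_requests.index(item)
        except ValueError:
            return float("inf")  # never accessed again

    return max(cached, key=next_use)


# Example: "c" is requested last, so it is the item to purge.
victim = farthest_future_victim(["a", "b", "c"], ["b", "a", "b", "c"])
```

Practical policies such as LRU replace the unknown next-access time with an estimate derived from past accesses.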
Another caching policy is proposed in the paper: E. J. O'Neil, P. E. O'Neil, and G. Weikum, "The LRU-K Page Replacement Algorithm for Database Disk Buffering," in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp. 297-306, 1993 (hereinafter referred to as reference 2). In the LRU-K method, the times of the last K references (K: a positive integer) to respective data items are tracked, and the data item whose Kth most recent reference was made least recently is purged. In the case where K=2, namely, in the case of LRU-2, respective data items are given different priorities depending on whether the data has been referenced two or more times. Among the data items to which two or more references have been made, the data item whose second-to-last reference was made least recently is given the lowest priority. Naturally, data to which only one reference has been made is given lower priority than data which has been referenced two or more times. Whereas LRU uses only the last access time as information, the LRU-K algorithm uses the access times of the last K references, and thus LRU-K is a caching algorithm based on more information than LRU.
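The LRU-K selection rule described above can be sketched as follows. This is only an illustrative sketch, not the implementation of reference 2: the function name, the dictionary of reference times, and the tie-breaking rule among items with fewer than K references (oldest last reference first) are assumptions.

```python
import math


def lru_k_victim(history, k=2):
    """Pick the item to purge under an LRU-K style rule.

    history: dict mapping item key -> list of reference times, oldest first.
    An item with fewer than k references has no k-th most recent reference,
    so that reference is treated as infinitely old (lowest priority).
    Ties are broken by the last reference time (an assumption of this sketch).
    """
    def sort_key(item):
        times = history[item]
        kth = times[-k] if len(times) >= k else -math.inf
        return (kth, times[-1])

    return min(history, key=sort_key)


# "c" was referenced most recently but only once, so LRU-2 purges it,
# whereas plain LRU would purge "b" (oldest last reference).
history = {"a": [1, 9], "b": [2, 3, 8], "c": [10]}
victim = lru_k_victim(history, k=2)
```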
In another caching policy called the LRFU (Least Recently/Frequently Used) policy, which is described in the paper: D. Lee, J. Choi, J. H. Kim, S. H. Noh, S. L. Min, Y. Cho, and C. S. Kim, "On the Existence of a Spectrum of Policies that Subsumes the Least Recently Used (LRU) and Least Frequently Used (LFU) Policies," in Proceedings of the 1999 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pp. 134-143, 1999 (hereinafter referred to as reference 3), respective data items are given an ordering of priority based on the CRF (Combined Recency and Frequency) value. The CRF value C(t) at time t is determined by a weighting function F(x). Assuming that, for example, the current time is 8, and that data was accessed at times 1, 2, 5 and 8, the CRF value C(t) is calculated as follows:
C(t)=F(8-1)+F(8-2)+F(8-5)+F(8-8)=F(7)+F(6)+F(3)+F(0).
When many references have been made to data, the computation of priority becomes expensive and the volume of information to be stored increases. However, if the weighting function F(x) has the property F(x+y)=F(x)F(y), then C(t) is derived as follows:
C(t)=F(8-1)+F(8-2)+F(8-5)+F(8-8)=F(3+5-1)+F(3+5-2)+F(3+5-5)+F(3+5-8)=F(0)+F(3)C(5).
Thus, the CRF value can be easily computed from the time of the past reference and the CRF value at that time. Reference 3 indicates that the LRFU policy achieves higher hit rates than the LRU-2 policy does.
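The incremental computation above can be checked numerically with a minimal sketch. The exponential weighting function used here, F(x) = (1/2)^(lam*x), is an assumption chosen only because it satisfies F(x+y)=F(x)F(y); the function names and the parameter lam are likewise hypothetical.

```python
def F(x, lam=0.5):
    """Assumed weighting function; satisfies F(x + y) == F(x) * F(y)."""
    return 0.5 ** (lam * x)


def crf_direct(access_times, now, lam=0.5):
    """CRF computed from the full list of past access times."""
    return sum(F(now - t, lam) for t in access_times)


def crf_update(prev_crf, prev_time, now, lam=0.5):
    """CRF after a new reference at time `now`, using only the CRF value
    stored at the previous reference time `prev_time`."""
    return F(0, lam) + F(now - prev_time, lam) * prev_crf


# Accesses at times 1, 2, 5 and 8, as in the example in the text.
c5 = crf_direct([1, 2, 5], now=5)          # C(5) = F(4) + F(3) + F(0)
c8_incremental = crf_update(c5, 5, 8)      # F(0) + F(3) * C(5)
c8_direct = crf_direct([1, 2, 5, 8], now=8)
```

With such a weighting function, only the last reference time and the CRF value at that time need to be kept per data item, rather than the whole access history.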
Yet another caching policy, called Early Eviction LRU (EELRU), has been proposed in the paper: Y. Smaragdakis, S. Kaplan, and P. Wilson, "EELRU: Simple and Effective Adaptive Page Replacement," in Proceedings of the 1999 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pp. 122-133, 1999 (hereinafter referred to as reference 4). Like LRU, this caching algorithm uses the time of the last reference as information. EELRU performs LRU replacement and purges the data item that was least recently accessed unless many recently accessed data items have just been purged. If many recently accessed data items have been purged, the e-th most recently accessed data item is purged from the cache. Additionally, e is dynamically adjusted so as to improve hit rates.
Each of the algorithms LRU-K, LRFU and EELRU achieves higher hit rates than well-known caching algorithms such as LRU. However, they do not work well for proxy caches. In the caching policy applied to the proxy cache, in many cases, caching and purging of data are performed with respect to each data item requested by the clients. Consequently, when a client requests a large data item and the data item is cached, a number of small ones may be evicted to make space available for storing the large data item. In other words, the usefulness of data depends on the size of the data in addition to access patterns. The GDSF policy (Greedy-Dual-Size-Frequency Caching Policy) described in reference 1 is a caching policy that takes the data size into account. GDSF gives reduced priority to large data and thereby raises hit rates. However, when data of multimedia objects, which is large and read sequentially, is stored in the cache, the data (continuous media data) is given low priority. Therefore, the GDSF policy is not adequate for dealing with continuous media data.
A cache memory section is often composed of a plurality of storage media, each having a different processing speed and a different capacity. A general cache memory section includes a high-speed, low-capacity primary memory and a low-speed, high-capacity secondary memory. When a proxy cache apparatus provided with such a memory section deals with large data such as continuous media data, the data is almost always stored in the secondary memory because the data is too large for the capacity of the primary memory. This means that if traffic concentrates on the continuous media data, the speed at which the data is read out of the low-speed secondary memory becomes a bottleneck in the processing. Moreover, since the very large continuous media data monopolizes transfers from the low-speed secondary memory, other data cannot be transferred from the secondary memory, where many data items are stored. This problem arises not only when feeding data from the secondary memory into the primary memory, but also when feeding data from a remote host into a local secondary memory.
A caching policy for continuous media data is described in the paper: Asit Dan and Dinkar Sitaram, "A Generalized Interval Caching Policy for Mixed Interactive and Long Video Workloads," Multimedia Computing and Networking, pp. 344-351, 1996 (hereinafter referred to as reference 5). The Generalized Interval Caching (GIC) policy of reference 5 exploits the sequential access to continuous media data. That is, although continuous media data is very large, the whole of the continuous media data is not requested at once; instead, it is broken into parts, and the parts are requested one by one from the first at a certain rate. Considering each part of the continuous media data separately, once the first part is requested, the following parts will be requested sequentially over time. Therefore, the access times of the following parts can be determined on arrival of the request for the first part. Thus, the parts with more recent access times are given higher priority.
Generally, it is difficult to effectively cache data items having totally different access patterns, such as continuous media data and non-continuous media data. For example, even if separate proxy cache apparatuses are provided for the continuous media data and the non-continuous media data, respectively, and the two types of data are distinguished from one another using a layer 7 switch or the like, the proxy cache apparatuses cannot share computing resources. Therefore, when traffic concentrates on the continuous media data, the proxy cache apparatus for the continuous media data becomes overloaded while ample computing resources remain in the cache apparatus for the non-continuous media data. Moreover, it is costly to use plural proxy cache apparatuses and a layer 7 switch. It is preferable that both continuous media data and non-continuous media data can be stored in one proxy cache apparatus.
A plurality of caching policies may be applied to a cache system. Japanese Patent Application laid open No. HEI11-65927 discloses a method of using different caching policies, in which existing caching policies such as LRU, LFU and GIC can be used concurrently. With this method, it is possible to prioritize respective data items based on plural policies. However, the memory space for each of the caching policies is independent, and there are cases where one caching policy has no available space while another does. In such a case, data processed by the caching policy having no available memory space cannot be stored even if there is enough space to store the data in the memory space as a whole. In short, although this method is superior in cost to the method of using plural proxy cache apparatuses and a layer 7 switch, computing resources are still not shared. Thus, with this method it is difficult to deal with requests for data whose popularity changes dynamically.
As described above, the conventional techniques of references 1 to 5 have the problem that data is managed based on one policy, so that plural types of data having different access patterns, for example, continuous media data and non-continuous media data, cannot be effectively cached. The method described in Japanese Patent Application laid open No. HEI11-65927 solves the problem to some extent. Namely, it is possible to cache each type of data effectively by applying the GIC policy to the continuous media data and the LFU policy to the non-continuous media data.
In that method, however, individual memory spaces are provided for the different policies, and each type of data is fixedly allocated to one of them. Consequently, if traffic concentrates on a certain type of data, the memory space for that type of data runs short of free space, and data is purged frequently even when the other memory space has enough free space.
It is therefore an object of the present invention to obtain high cache hit rates through effective allocation of resources, even when multiple types of data are requested and the ratio of requests for each type of data to all requests fluctuates dynamically.
In accordance with the present invention, to achieve the above object, data items stored in a cache memory section are divided into groups of data each having a different access pattern, and the data items are prioritized within each group by using an individual caching algorithm. When a data item needs to be purged from the cache memory section, the data item given the lowest priority by the caching algorithm of its group is purged from the lowest priority group, which is determined by prescribed evaluation standards. With this construction, while data is managed in each group based on an individual caching algorithm, it is possible to select the lowest priority data item from among all data items in all groups stored in the memory section. In addition, the memory space is not fixedly allocated to each group. Accordingly, even in the case where traffic concentrates on data items in a certain group and there are few accesses to data in the other group, more memory space is made available for caching the data of the certain group by reducing the memory space allocated to the other group, thus realizing higher cache hit rates.
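The purge selection summarized above might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class and function names are hypothetical, and the "prescribed evaluation standards" are not specified in this section, so the group evaluation is passed in as a function. The example evaluation (the last access time of each group's own lowest priority item) and the per-group priorities (reference count for one group, last access time as a simple stand-in for GIC in the other) are assumptions made only for the sake of the example.

```python
class Group:
    """A group of cached items sharing one access pattern and one policy."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority  # metadata -> number; lowest is purged first
        self.items = {}           # key -> metadata dict

    def lowest(self):
        """Key of this group's lowest priority item under its own policy."""
        return min(self.items, key=lambda k: self.priority(self.items[k]))


def select_victim(groups, group_score):
    """Choose the lowest priority group, then that group's lowest item."""
    candidates = [g for g in groups if g.items]
    if not candidates:
        return None
    group = min(candidates, key=group_score)
    return group, group.lowest()


def make_room(groups, group_score, capacity, needed):
    """Purge items from the shared memory space until `needed` units fit."""
    used = sum(m["size"] for g in groups for m in g.items.values())
    purged = []
    while used + needed > capacity:
        victim = select_victim(groups, group_score)
        if victim is None:
            break
        group, key = victim
        used -= group.items.pop(key)["size"]
        purged.append((group.name, key))
    return purged


# Hypothetical evaluation standard: compare groups by the last access time
# of each group's own lowest priority item.
def oldest_candidate(group):
    return group.items[group.lowest()]["last"]


web = Group("web", priority=lambda m: m["count"])    # frequency-based
media = Group("media", priority=lambda m: m["last"]) # recency-based stand-in
web.items = {"a": {"last": 10, "count": 5, "size": 1},
             "b": {"last": 12, "count": 1, "size": 1}}
media.items = {"v1": {"last": 3, "count": 9, "size": 4},
               "v2": {"last": 11, "count": 2, "size": 4}}

purged = make_room([web, media], oldest_candidate, capacity=10, needed=3)
```

Because capacity is accounted for across all groups rather than per group, a group receiving most of the traffic can grow at the expense of a group whose items are rarely accessed.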