The present invention relates to cache devices and to methods and computer programs for controlling cached data. In particular, it relates to a cache device, a method, and a computer program for controlling cached data that enable efficient use of the data storage area in the cache device and that improve the hit ratio when caching data sent and received through networks.
Recently, large amounts of image and voice data, various types of programs, and the like have been transmitted through networks such as the Internet. The content transmitted through networks is no longer limited to text and still images but now extends to multimedia content such as moving images and voice, so that both the volume and the variety of content have increased considerably.
Delivery of large-volume content through broadband networks has rapidly increased, and with it the total amount of content delivered through networks. As the types of content multiply and large-volume content becomes more common, the volume of data communication, that is, data traffic in networks, disadvantageously increases.
One method for solving the problem of increased data traffic in networks is a cache system. A cache system includes caches functioning as data storage areas on network paths extending from a content delivery site to users that receive the content. When user-requested data (content) is stored in the caches, the data (content) is not sent from the content delivery site, but instead is retrieved from the caches and sent to the users.
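The behavior described above can be sketched as follows. This is only an illustrative model, not the device of the invention: the class name `PathCache`, the callable `fetch_from_origin` (standing in for the content delivery site), and the hit and miss counters are all assumptions introduced for the example.

```python
class PathCache:
    """Illustrative cache placed on the path between a delivery site and users."""

    def __init__(self, fetch_from_origin):
        # Hypothetical callable (url -> content) standing in for the delivery site.
        self._fetch_from_origin = fetch_from_origin
        self._store = {}   # url -> cached content
        self.hits = 0
        self.misses = 0

    def get(self, url):
        # Cache hit: serve from the cache; the delivery site is not contacted.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Cache miss: retrieve from the delivery site and keep a copy.
        self.misses += 1
        content = self._fetch_from_origin(url)
        self._store[url] = content
        return content

    def hit_ratio(self):
        # Fraction of requests answered from the cache.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this sketch, the hit ratio is simply the fraction of requests served without contacting the delivery site; no capacity limit is modeled yet.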
When, for example, a user retrieves content through a network and the content is stored in a temporary storage device (cache) on the network, the cache stores and deletes content according to predetermined rules. For example, once the storage capacity of the cache has been reached, infrequently accessed content in the cache is replaced each time up-to-date content is input to the cache. In general, these rules for controlling the content of the cache are, in effect, independent of direct control by either the content delivery side or the user side.
In controlling a data area of an Internet cache in this way, when data occupies the full space of the data area, an inactive region of the data area is detected and the data in the inactive region is deleted and replaced with up-to-date data. This process is called cache replacement.
In known cache replacement in an Internet cache, when a number of caches exist in networks such as the Internet, independent data control is generally carried out in each cache device. The most common cache replacement method is a data control method using the least recently used (LRU) algorithm.
The LRU algorithm is a control method that registers a data block that has not been used recently at the end of an LRU list and registers a data block that has been used recently at the head of the LRU list. In this method, when old data must be deleted to cache new data, a new data storage area is secured by deleting the data block registered at the end of the LRU list.
Referring to FIG. 1, the data control method using the LRU algorithm will now be described. The data used in the control are an LRU list (a), whose entries correspond to individual data blocks and are ordered by how recently the blocks were used, and a free block list (b) indicating free area blocks in a cache. Newly cached data is stored in a data storage block, serving as a memory region, which is pointed to by a pointer held in an entry of the free block list (b). An entry that stores block information of the data storage block is then registered as a head entry 101 in the LRU list (a).
When inputting (caching) new data would fully occupy the data storage area of the cache, so that some of the previously cached data must be deleted, a data block that has not been used recently, corresponding to an entry 102 at the end of the LRU list (a), is deleted. The freed data storage block is registered as a free area by registering an entry indicating pointer information of the deleted data block in the free block list (b).
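The interplay of the LRU list (a) and the free block list (b) can be illustrated with a minimal sketch. It assumes fixed-size blocks identified by integer indices and uses Python's `OrderedDict` for the LRU list, with the first item playing the role of the head entry 101 and the last item the role of the tail entry 102. The class name `BlockCache` and its method names are assumptions made for the example, not terms from FIG. 1.

```python
from collections import OrderedDict, deque


class BlockCache:
    """Illustrative LRU control with an LRU list and a free block list."""

    def __init__(self, num_blocks):
        if num_blocks < 1:
            raise ValueError("num_blocks must be at least 1")
        self.blocks = [None] * num_blocks          # data storage blocks
        self.free_list = deque(range(num_blocks))  # free block list (b)
        self.lru = OrderedDict()                   # LRU list (a): key -> block index

    def get(self, key):
        if key not in self.lru:
            return None
        # A used block is re-registered at the head of the LRU list.
        self.lru.move_to_end(key, last=False)
        return self.blocks[self.lru[key]]

    def put(self, key, data):
        if key in self.lru:
            # Key already cached: overwrite its block and move it to the head.
            self.blocks[self.lru[key]] = data
            self.lru.move_to_end(key, last=False)
            return
        if not self.free_list:
            self._evict_tail()
        # Take a block from the free block list and store the new data in it.
        block = self.free_list.popleft()
        self.blocks[block] = data
        # Register the new entry at the head of the LRU list.
        self.lru[key] = block
        self.lru.move_to_end(key, last=False)

    def _evict_tail(self):
        # Delete the least recently used block (tail entry) and
        # register its block index in the free block list.
        key, block = self.lru.popitem(last=True)
        self.blocks[block] = None
        self.free_list.append(block)
        return key
```

With two blocks, for instance, caching "a" and "b", reading "a", and then caching "c" evicts "b", because "b" is then the entry at the end of the LRU list.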
The above method for controlling data blocks for a cache uses information from a single cache device only. A method for requesting data that does not reside in a requesting cache device from a number of neighboring cache devices has been proposed as the Internet Cache Protocol (ICP: Requests for Comments RFC-2186, RFC-2187). In this method, data is requested URL by URL, by specifying the uniform resource locator (URL) that serves as resource information of a content file, and there is no function for finely coordinating which data is deleted or kept across the caches. Thus, the requested data is retrieved from another cache device instead of being downloaded from a server only when the same data happens to remain in that cache device as a result of its own independent operation.
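The per-URL neighbor query described above can be modeled roughly as follows. This sketch does not implement the ICP message format of RFC-2186; it only mimics the idea that a cache asks its neighbors whether they hold a given URL and falls back to the server otherwise. The class `CacheNode`, its methods, and the callable `fetch_from_server` are hypothetical names introduced for illustration.

```python
class CacheNode:
    """Illustrative cache device that can query neighboring cache devices by URL."""

    def __init__(self, name, fetch_from_server):
        self.name = name
        self.store = {}        # url -> content, managed independently per node
        self.neighbors = []    # other CacheNode instances
        # Hypothetical callable (url -> content) standing in for the origin server.
        self._fetch_from_server = fetch_from_server

    def query(self, url):
        # Answer a neighbor's query: report only whether this node holds the URL.
        return url in self.store

    def get(self, url):
        # 1. Serve from the local cache if possible.
        if url in self.store:
            return self.store[url], "local"
        # 2. Ask each neighbor, URL by URL; a hit is a matter of chance,
        #    because each neighbor decides on its own what to keep or delete.
        for neighbor in self.neighbors:
            if neighbor.query(url):
                content = neighbor.store[url]
                self.store[url] = content
                return content, "neighbor:" + neighbor.name
        # 3. Otherwise download from the server.
        content = self._fetch_from_server(url)
        self.store[url] = content
        return content, "server"
```

Nothing in this model lets the nodes agree on which of them should keep a given item, which is the limitation the paragraph above points out.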