The present invention relates generally to a data cache and, more particularly, to a distributed data cache whose file storage can be controlled in accordance with a record expiration model.
A data cache is a well-known tool for the temporary storage of data. Typically, the data is downloaded from a data source into the data cache and temporarily saved for subsequent use, thereby avoiding the need to download the data again from the data source. For example, a data cache may be used for the temporary storage of data downloaded from an Internet web site. In this example, a computer, such as a conventional personal computer (PC), executes a web browser application program. Data may be downloaded from a web site for display on the PC. The data is stored in a data cache within the PC for subsequent display on the PC so as to avoid having to access the web site a second time to download the same data. The data caching process greatly enhances the speed of operation by eliminating the need to download data a second time.
Computers, such as a PC, workstation, or the like, are frequently connected to other computers to form a computer network. Each portion of the computer network may include its own data cache as an integral part of an application program(s) that may be executing on the particular computer. Unfortunately, these data caches are accessible only through the particular application programs being executed on each computer and are not available for use by other portions of the computer network.
Therefore, it can be appreciated that there is a significant need for a distributed data cache having a general form that is accessible to any portion of the computer network. The present invention provides this and other advantages, as will be apparent from the following detailed description and accompanying figures.
A distributed cache system may comprise a local cache and an array of remote caches. Each cache may be implemented using any suitable data structure for storing and retrieving data items. A cache controller associated with each cache independently controls the operation of its respective cache. In one implementation, a first data structure is used to implement a data cache and store data in association with a first computer (which may sometimes be characterized herein as a computer platform or computer system). A plurality of remote computers are separate from the first computer and are coupled to the first computer. Each of the remote computers has an associated remote data structure to store data. Each of the data structures has an associated first utilization list that contains data corresponding to data items stored within the respective data structure. The first utilization list indicates a sequence in which the data items were stored within the respective data structure.
The first data structure and each of the remote data structures each have an associated second utilization list containing data corresponding to data items stored within the respective data structure. The second utilization list contains data indicating a sequence in which data items were retrieved from the respective data structure, wherein utilization data placed in the second utilization list is removed from the first utilization list. Thus, the first utilization list for each data structure contains data indicating the sequence in which data items that have not yet been retrieved were stored in the associated data structure, while the second utilization list contains data indicative of a sequence in which data items were retrieved from the associated data structure. The first data structure and each of the remote data structures each have an associated cache controller to delete data items from the respective cache.
In one embodiment, the cache controller deletes data items from the respective data structures based on the first and second utilization lists associated with the respective data structure. For example, the cache controller will remove data items from a respective data structure based on the first utilization list by deleting a data item corresponding to utilization data indicating that the data item was the oldest stored item within the first utilization list. If no data is within the first utilization list, the cache controller may delete a least recently used (LRU) data item based on the utilization data within the second utilization list.
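By way of illustration only, the deletion behavior described above may be sketched as follows. This is a minimal sketch under stated assumptions and not a definitive implementation: the class name TwoListCache, the method names store, retrieve, and evict, the fixed capacity parameter, and the use of ordered dictionaries for the two utilization lists are all hypothetical choices that do not appear in the description.

```python
from collections import OrderedDict


class TwoListCache:
    """Hypothetical single cache with two utilization lists."""

    def __init__(self, capacity):
        # Assumption: capacity is a fixed item count of at least 1.
        self.capacity = capacity
        self.items = {}
        # First utilization list: keys in the order they were stored,
        # holding only items that have not yet been retrieved.
        self.stored_order = OrderedDict()
        # Second utilization list: keys in the order they were last
        # retrieved (least recently used first).
        self.retrieved_order = OrderedDict()

    def store(self, key, value):
        if key in self.items:
            # Assumption: re-storing a key is treated as a fresh store.
            self.stored_order.pop(key, None)
            self.retrieved_order.pop(key, None)
        elif len(self.items) >= self.capacity:
            self.evict()
        self.items[key] = value
        self.stored_order[key] = None

    def retrieve(self, key):
        if key not in self.items:
            return None
        # A retrieved item leaves the first list and becomes the most
        # recently used entry of the second list.
        self.stored_order.pop(key, None)
        self.retrieved_order.pop(key, None)
        self.retrieved_order[key] = None
        return self.items[key]

    def evict(self):
        """Delete one item and return its key, or None if empty."""
        if self.stored_order:
            # Oldest stored item in the first utilization list.
            key, _ = self.stored_order.popitem(last=False)
        elif self.retrieved_order:
            # First list is empty: fall back to the LRU item.
            key, _ = self.retrieved_order.popitem(last=False)
        else:
            return None
        del self.items[key]
        return key
```

In this sketch, an item that has been stored but never retrieved is deleted before any retrieved item, and the least recently used policy applies only once the first utilization list is empty.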
In another embodiment, a data structure may have a plurality of first utilization lists associated therewith. The data in each of the plurality of first utilization lists corresponds to a portion of the data items stored within the data structure and is further indicative of a sequence in which the corresponding portion of data items was stored within the data structure.
A first indicator is used to select the one of the plurality of first utilization lists that will store the data corresponding to the data items stored within the data structure. A second indicator is used to select the one of the plurality of first utilization lists on which deletion is based. A cache controller associated with the data structure can delete data items from the data structure. The cache controller will delete a data item based on data contained within the one of the plurality of first utilization lists selected by the second indicator.
In an exemplary embodiment, the first indicator is altered after a data item is stored within the data structure to select a different one of the plurality of first utilization lists. Similarly, the second indicator may be altered following the deletion of data from the data structure to select a different one of the plurality of first utilization lists. In one embodiment, the first and second indicators select different ones of the plurality of first utilization lists.
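By way of illustration only, the use of two indicators with a plurality of first utilization lists may be sketched as follows. This is a minimal sketch under stated assumptions and not a definitive implementation: the class name MultiListCache, the method names store and delete_one, the round-robin advancement of each indicator, and the skipping of empty lists during deletion are hypothetical choices that do not appear in the description. For simplicity, both indicators start at the same list, so the embodiment in which the two indicators select different lists is not modeled.

```python
from collections import deque


class MultiListCache:
    """Hypothetical cache with several first utilization lists."""

    def __init__(self, num_lists=4):
        self.items = {}
        # Each list records, in storage order, a portion of the keys.
        self.lists = [deque() for _ in range(num_lists)]
        self.store_index = 0   # first indicator: list used on store
        self.delete_index = 0  # second indicator: list used on delete

    def store(self, key, value):
        if key in self.items:
            # Assumption: overwriting keeps the existing list entry.
            self.items[key] = value
            return
        self.items[key] = value
        self.lists[self.store_index].append(key)
        # Alter the first indicator after each store (round robin is
        # an assumption; the description only says "a different one").
        self.store_index = (self.store_index + 1) % len(self.lists)

    def delete_one(self):
        """Delete one item and return its key, or None if empty."""
        for _ in range(len(self.lists)):
            selected = self.lists[self.delete_index]
            # Alter the second indicator after each attempt.
            self.delete_index = (self.delete_index + 1) % len(self.lists)
            if selected:
                # Oldest stored key recorded in the selected list.
                key = selected.popleft()
                del self.items[key]
                return key
        return None
```

Spreading the utilization data over several lists in this way means that a store and a deletion need not touch the same list at the same time, which is one plausible motivation for using separate indicators.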