1. Technical Field
The present application is generally related to a system and method for sharing and caching information in a data processing system and, in particular, to a system and method for managing a cacheable state which is shared between processes and clones in a computer network.
2. Description of Related Art
Caching is a technique that is typically employed in many computer systems to improve performance. For example, in an object-oriented environment, caching an object can minimize the cost for fetching or materializing an object since it is only incurred once. Subsequent requests can be satisfied from the cache, a step which incurs significantly less overhead, thus resulting in improved performance overall.
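By way of illustration only, the fetch-once behavior described above can be sketched as a small read-through cache. The class name `ReadThroughCache`, the `fetch` callable, and the `misses` counter are hypothetical names introduced for this sketch and do not appear in the present application.

```python
class ReadThroughCache:
    """Minimal read-through cache: the fetch cost is paid once per key."""

    def __init__(self, fetch):
        # `fetch` is an assumed callable that materializes an object
        # from its (expensive) source given a key.
        self._fetch = fetch
        self._store = {}
        self.misses = 0  # number of times the expensive fetch was invoked

    def get(self, key):
        # Only the first request for a key pays the fetch cost.
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._fetch(key)
        # Subsequent requests are satisfied from the cache.
        return self._store[key]
```

In this sketch, repeated calls to `get` with the same key invoke `fetch` only on the first call, which is the source of the performance improvement.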
A key problem associated with caching items is that of preventing each cache from supplying stale data. Cached data become stale whenever actual values have changed but cached copies of these values have not been updated to reflect the changes. Since it is most undesirable to supply stale data, caches typically purge stale data and request a fresh copy from the data source. This replenishment incurs the usual overhead that employment of caches seeks to minimize.
A “model” is a template for creating additional, nearly identical copies of a server or process instance, such as an application server or servlet engine. Such copies are called “clones”, and the act of creating them is called cloning. A clone (or cloned process) is a special case of a process. Many computer systems are composed of such processes and clones. Cloning allows multiple copies of the same object to behave together as if they were a single image, with the idea that clients experience improved performance. More specifically, processes and clones often perform particular tasks and communicate with other processes and clones performing the same or other tasks. There are various benefits associated with having separate processes and clones perform individual tasks, including but not limited to reusability, understandability, and efficiency.
Processes and clones often utilize caches to improve performance. A key problem with separate processes and clones is that of keeping their caches consistent. A process or clone cache becomes inconsistent when changes occur that affect the cache, either within the process or clone itself or within another process or clone.
In order to alleviate the problem of unusable cached data at request time among processes and clones, there is a need for a mechanism to efficiently coordinate changes that would cause a cache to become stale. Also, there is a need to effectively measure the costs and benefits associated with keeping distributed caches synchronized and to determine optimal entity cacheabilities.
The present invention is directed to a system and method for sharing and caching information in a data processing system and for efficiently managing a cacheable state shared among processes and clones. In one aspect of the present invention, a method for managing a plurality of caches distributed in a network, comprises the steps of:
maintaining, by each cache, a plurality of statistics associated with a cacheable object, wherein the statistics associated with the cacheable object comprise an access frequency (A(o)), an update frequency (U(o)), an update cost (C(o)), and a cost to fetch the cacheable object from a remote source (F(o));
computing, by each cache, an associated metric using said statistics, wherein the associated metric quantitatively assesses the desirability of caching the cacheable object; and
utilizing, by a given cache, the associated metric to make caching decisions associated with the cacheable object.
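The three steps above can be sketched as follows. This summary does not state the formula for the metric, so the expression used here, A(o)·F(o) − U(o)·C(o) (fetch cost avoided by caching, less the cost of keeping the cached copy current), is an assumption chosen purely for illustration. The names `ObjectStats`, `desirability`, `should_cache`, and the `threshold` parameter are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    """Per-object statistics maintained by a cache."""
    access_freq: float  # A(o): accesses per unit time
    update_freq: float  # U(o): updates per unit time
    update_cost: float  # C(o): cost of applying one update to the cached copy
    fetch_cost: float   # F(o): cost of fetching the object from a remote source


def desirability(s: ObjectStats) -> float:
    # ASSUMED metric, not taken from the source text:
    # savings from avoided remote fetches minus the cost of updates.
    return s.access_freq * s.fetch_cost - s.update_freq * s.update_cost


def should_cache(s: ObjectStats, threshold: float = 0.0) -> bool:
    # A cache might keep an object only when the metric exceeds a threshold.
    return desirability(s) > threshold
```

Under this assumed metric, an object that is read often and rarely updated scores high, while an object that is updated far more often than it is read scores negative and would not be cached.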
In another aspect of the present invention, a method for managing a cacheable state shared between a plurality of caches associated with communicating processes, comprises the steps of:
caching, by a process, an object in a cache associated with the process;
maintaining, by the process, dependency information associated with the cached object;
utilizing, by the process, the dependency information of the cached object to determine if the cached object is affected by a change to underlying data; and
updating all caches associated with other processes containing the cached object, if the cached object is affected by a change to underlying data.
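A minimal sketch of the steps above is given below, assuming that dependency information is kept as a mapping from an underlying data identifier to the keys of the cached objects that depend on it. The class `ProcessCache`, its methods, the `peers` list, and the `recompute` callable are hypothetical names for this sketch. Peer caches are modeled as in-memory objects, whereas an actual system would communicate between processes over a network.

```python
from collections import defaultdict


class ProcessCache:
    """Cache owned by one process, with per-object dependency information."""

    def __init__(self, name):
        self.name = name
        self.objects = {}              # cached objects by key
        self.deps = defaultdict(set)   # data id -> keys of dependent objects
        self.peers = []                # caches of other processes (assumed reachable)

    def put(self, key, value, depends_on):
        # Cache the object and record which underlying data it depends on.
        self.objects[key] = value
        for data_id in depends_on:
            self.deps[data_id].add(key)

    def affected_by(self, data_id):
        # Use the dependency information to find objects affected by a change.
        return set(self.deps.get(data_id, ()))

    def on_data_change(self, data_id, recompute):
        # `recompute` is an assumed callable producing the fresh value for a key.
        affected = self.affected_by(data_id)
        for key in affected:
            new_value = recompute(key)
            # Update this cache and every peer cache that holds the object.
            for cache in [self] + self.peers:
                if key in cache.objects:
                    cache.objects[key] = new_value
        return affected
```

Objects whose dependency records do not mention the changed data are left untouched, so only the affected entries incur update traffic.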
In yet another aspect of the present invention, a system for managing networked caches, comprises:
a plurality of communicating processes, wherein each communicating process is associated with a cache; and
a cache manager, associated with each process, for managing the associated cache and maintaining dependency information for each object stored in the cache, wherein the cache manager is adapted to share the dependency information among other communicating processes.
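One possible arrangement of such a system is sketched below, with one cache manager per process that forwards each new dependency record to its peers. The class `CacheManager`, its methods, and the process names are hypothetical. The sharing policy shown (eager broadcast on every store) is an assumption and is only one of several ways dependency information could be shared.

```python
class CacheManager:
    """Manages one process's cache and shares dependency information with peers."""

    def __init__(self, process_name):
        self.process_name = process_name
        self.cache = {}
        self.local_deps = {}    # key -> frozenset of data ids it depends on
        self.remote_deps = {}   # (owner process, key) -> frozenset of data ids
        self.peers = []

    def connect(self, other):
        # Establish a two-way link between two communicating processes.
        self.peers.append(other)
        other.peers.append(self)

    def store(self, key, value, depends_on):
        self.cache[key] = value
        deps = frozenset(depends_on)
        self.local_deps[key] = deps
        # Share the dependency record with every peer (assumed eager policy).
        for peer in self.peers:
            peer.receive_dependencies(self.process_name, key, deps)

    def receive_dependencies(self, owner, key, deps):
        self.remote_deps[(owner, key)] = deps

    def holders_affected_by(self, data_id):
        # Names of processes known to cache an object that depends on `data_id`.
        names = {owner for (owner, _key), deps in self.remote_deps.items()
                 if data_id in deps}
        if any(data_id in deps for deps in self.local_deps.values()):
            names.add(self.process_name)
        return names
```

Because each manager holds its peers' dependency records, any process that observes a change to underlying data can determine which other processes need to be notified.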
These and other aspects, features, and advantages of the present invention will become apparent from the following detailed description of the preferred embodiments, which is to be read in connection with the accompanying drawings.