1. Field of the Invention
The present invention relates to a computer-based memory system, and, more particularly, to methods for providing cache refresh within a finite time window with predictable accuracy and constrained resources.
2. Description of the Related Art
Performance-oriented and disconnected applications are typically executed on caching-based computer systems. Caching occurs any time content or resources are retrieved, assembled or created, and then stored for later use. Caching often improves overall performance. Furthermore, it reduces the dependency on resource availability (e.g., the availability of the network or of the cached resource).
An issue computer architects and developers struggle with is maintaining data freshness. The data being cached generally has a limited useful life span. That is, at some point, the data in the cache may no longer accurately reflect the data at the source. At this point, the invalid data (i.e., the data in the cache) can either be purged or be refreshed.
There are a number of known approaches to purging or refreshing the invalid data. First, an application may systematically refresh the invalid data when a condition is satisfied (e.g., elapsed time, access). Second, a more advanced approach may provide a messaging-based solution where caches listen for and post changes to a common message bus. A source may submit data changes to the message bus, or the caches can potentially provide refresh and publish functionality. Third, databases may have a timestamp field to enable querying of recently changed objects. This type of query often helps with synchronization, and is often referred to as polling. With polling, the source data is checked periodically to see if it has changed, and the source data is pulled only if it has changed. Fourth, the Hypertext Transfer Protocol (“HTTP”) specification may attempt to address the issue of purging or refreshing the invalid data using header fields called cache control headers. The cache control headers are directed to browsers and proxy servers to specify how long to cache the resource and when to check for a new resource.
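By way of illustration only, the polling approach described above may be sketched as follows. The sketch assumes an in-memory source keyed by item name; the names SourceRecord, modified_at, and poll_changes are hypothetical and are not drawn from any particular database or product.

```python
from dataclasses import dataclass


@dataclass
class SourceRecord:
    value: str
    modified_at: float  # hypothetical timestamp field: time of last change at the source


def poll_changes(source, cache, last_poll):
    """Pull only the records changed since last_poll; return how many were refreshed."""
    refreshed = 0
    for key, record in source.items():
        # The timestamp check is the "poll"; unchanged records are not pulled.
        if record.modified_at > last_poll:
            cache[key] = record.value
            refreshed += 1
    return refreshed


# Record "b" changed after the previous poll at t=15.0; record "a" did not.
source = {"a": SourceRecord("1", 10.0), "b": SourceRecord("2", 20.0)}
cache = {"a": "1", "b": "old"}
count = poll_changes(source, cache, last_poll=15.0)
```

In this sketch only the record whose timestamp is later than the previous poll is copied into the cache, which is the property that distinguishes polling from an unconditional refresh.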
Each of the above approaches shares the same problem: there may be more cache items to refresh than there are resources (e.g., CPU, memory, network bandwidth) or time to actually refresh them. Specific challenges arise when dealing with large caches employing a time-based refresh strategy. In a time-based refresh strategy, the items in a cache are refreshed after a specific period of time. This time period is often referred to as the “time of usefulness” of the data. A problem occurs when the cache contains more items than can be refreshed within a specified time window. Once the cache refresh misses a targeted window, the data becomes progressively more out of date because the data expires faster than it can be refreshed.
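The capacity problem described above can be made concrete with a short calculation. The following is a minimal sketch under simplifying assumptions that are not part of the foregoing description: every item takes the same fixed time to refresh, and refreshes are performed by a fixed number of independent workers. The function name and all parameter values are hypothetical.

```python
def items_missing_window(cache_size, refresh_cost_s, window_s, workers=1):
    """Return how many cached items cannot be refreshed within one time window.

    Assumes (for illustration) a uniform per-item refresh cost in seconds and
    a fixed number of workers refreshing items in parallel.
    """
    # Maximum number of refreshes that fit into the window.
    capacity = int((window_s * workers) // refresh_cost_s)
    return max(0, cache_size - capacity)


# Hypothetical figures: 10,000 items, 0.5 s per refresh, a one-hour window.
missed_one_worker = items_missing_window(10_000, 0.5, 3600, workers=1)
missed_two_workers = items_missing_window(10_000, 0.5, 3600, workers=2)
```

With these assumed figures, one worker can refresh at most 7,200 items per hour, so 2,800 items miss the window; with two workers the capacity exceeds the cache size and no item misses it.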