This invention relates to the field of computer systems. More particularly, the invention provides a caching system for serving dynamic data and a method of operating the system to promote a desired level of performance.
Caching systems are often employed to enable faster responses to data requests, especially where the data being requested is stored on a relatively slow device (e.g., disk, tape). A caching system can improve performance by storing all or a portion of the data in a faster device (e.g., random access memory).
Existing caching systems are most suited for those environments in which the requested data is relatively static and/or is not the subject of heavy traffic or usage. In particular, existing systems may provide adequate benefits when the cached data need not be updated on a recurring or regular basis. Unfortunately, such systems are ill-suited to maintaining desired levels of performance when the requested data is dynamic in nature, particularly when the number or frequency of data requests is high. For example, on the Internet an enormous number of users request dynamic content in the form of news stories, financial data, multi-media presentations, etc., and may do so through customized user interfaces containing dynamic components. In particular, many sites or web pages accessed by users contain data that is updated or replaced on a regular basis.
For high-volume, dynamic environments such as the Internet, existing caching systems are not designed to maintain a steady level of performance. Instead, such environments are generally configured to maintain a consistent level of data quality, typically by attempting to always provide the newest or most recent version of requested data. Thus, when a master copy or version of data that is cached is altered or replaced, the version in the cache must be updated or replaced before the faster cache can once again be used to satisfy users' requests. Until the cache is updated, requests for the data must be satisfied from a slower device (e.g., where the master copy is stored). Thus, during heavy periods of traffic or when a large amount of cached data must be replaced, data requests cannot be served from the cache and, unless the web site maintains a sufficient number of alternative, slower, devices to respond to the requests, performance of the web site may decline precipitously.
As a result, a web site operator is faced with a quandary. The operator may employ a sufficient number of slower devices to handle an expected or peak level of traffic, in which case the caching system is superfluous. Alternatively, the operator must be willing to allow performance to degrade, possibly severely.
Therefore, what is needed is a caching system and a method of operating the caching system in an environment characterized by dynamic data and/or high volumes of data requests, wherein a desired level of performance (e.g., response time to data requests) can be substantially maintained during peak or high traffic periods without obviating the need for the caching system. What is also needed is a method of performance-based caching in which data consistency varies as little as possible.
In one embodiment of the invention a system and methods are provided for caching data in a manner that promotes a desired level of performance, as measured by response time for data requests, load placed on a system component (e.g., number of open connections), or some other parameter. In this embodiment the consistency of the data provided in response to a request may be allowed to fluctuate, by providing stale data for example, in order to promote the desired level of performance.
According to one embodiment, a caching system comprises a cache for storing copies of data items stored on a data server. The cache may be part of a separate cache server or may be combined with the data server. Generally, data can be provided from the cache server faster than it can be served from the data server. As long as the data in the cache accurately represents the data on the data server, the cached data is served in response to user requests for the data. When contents of the cache become invalid (e.g., stale or obsolete) because corresponding data on the data server changes or is replaced, the cache must receive the updated or replacement data before user requests can receive the new data from the cache server.
In one embodiment of the invention, when a request is received at a cache server for data that has been invalidated, the caching system may first determine whether the present or desired performance of the system (e.g., number of open connections, average or incremental response time, throughput, etc.) allows the request to be passed to the data server that stores an updated or current version of the data. The action taken in response to the request may also depend on factors such as the popularity of the requested data (e.g., how frequently or recently it has been requested) and/or its level of invalidity (e.g., how long ago it was invalidated, the severity of invalidation, when the data was last updated or replaced). Different embodiments of the invention may weigh the operative factors differently.
For example, when a request for data that is invalid on the cache server is passed to the data server because the requested data is popular and/or highly invalid, subsequent requests for the same data may be satisfied at the cache server using an invalid version until the updated data arrives. Conversely, when the requested data is invalid but is less popular and less severely invalid, an invalid version of the data may be returned. In addition, however, a lower priority request for an updated or replacement version of the data may be passed to the data server.
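By way of illustration only, the handling of a request for invalidated data described above might be sketched as follows. The names (`Action`, `EntryState`, `decide`), the thresholds, and the use of open connections as the performance measure are assumptions made for this sketch and are not taken from any particular embodiment. The sketch also assumes that popular or highly invalid data is served from the cache with a low priority refresh when the data server has no spare capacity.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FORWARD = "pass request to data server"
    SERVE_STALE = "serve invalid copy from cache"
    SERVE_STALE_AND_REFRESH = "serve invalid copy, queue low-priority refresh"


@dataclass
class EntryState:
    """Hypothetical bookkeeping for one invalidated cache entry."""
    requests_per_minute: float      # popularity of the data
    seconds_invalid: float          # how long ago it was invalidated
    refresh_in_flight: bool = False  # an earlier request was already forwarded


def decide(entry, open_connections, max_connections=100,
           popular_rpm=60.0, very_stale_seconds=30.0):
    """Choose how to answer a request for invalidated data."""
    # An earlier request is already fetching the update, so later
    # requests for the same data are satisfied with the invalid version.
    if entry.refresh_in_flight:
        return Action.SERVE_STALE

    has_capacity = open_connections < max_connections
    urgent = (entry.requests_per_minute >= popular_rpm
              or entry.seconds_invalid >= very_stale_seconds)

    # Popular and/or highly invalid data is fetched right away,
    # provided the data server can absorb the extra request.
    if urgent and has_capacity:
        return Action.FORWARD

    # Otherwise return the invalid version and refresh at lower priority.
    return Action.SERVE_STALE_AND_REFRESH


if __name__ == "__main__":
    popular = EntryState(requests_per_minute=120, seconds_invalid=5)
    print(decide(popular, open_connections=10))
```

Different embodiments may weigh these factors differently; the simple thresholds here stand in for whatever weighting a given system applies.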
In one alternative embodiment of the invention, after a cache entry is invalidated and until replacement data is cached, user requests for the replacement data may be selectively satisfied with invalid data. User requests may, alternatively, be held to await the replacement data or may be passed to the data server for satisfaction. In particular, at least one request for the new or replacement data may be passed to the data server in order to retrieve the data and store it in the cache (in addition to providing it to the requesting user). Subsequent requests may be held by the cache server to wait for the new data or may be satisfied using cache data that was invalidated. The cache server may consider the current performance of the system in determining how to satisfy a request and/or may consider how the system performance would be impacted by passing the request to the data server. In another embodiment of the invention cached data items may be refreshed not only for current, or pending, data requests, but may also be refreshed in anticipation of future data requests.
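As a non-limiting sketch of the alternative embodiment above, the following shows one request being passed to the data server while later requests for the same item are satisfied with the invalidated copy. The class `CoalescingCache` and its method names are hypothetical. The option of holding requests until the replacement data arrives is omitted for brevity, and the caller is assumed to store the replacement data once the data server returns it.

```python
class CoalescingCache:
    """Forwards at most one request per invalidated item; others get stale data."""

    def __init__(self):
        # key -> {"value": ..., "valid": bool, "pending": bool}
        self._entries = {}

    def store(self, key, value):
        """Cache new or replacement data and mark it valid."""
        self._entries[key] = {"value": value, "valid": True, "pending": False}

    def invalidate(self, key):
        """Mark the cached copy invalid but keep it for stale responses."""
        self._entries[key]["valid"] = False

    def lookup(self, key):
        """Return (disposition, value) for a request.

        "hit":     cached copy is valid and is returned.
        "forward": caller should fetch from the data server, then store().
        "stale":   a fetch is already pending; the invalid copy is returned.
        """
        entry = self._entries[key]
        if entry["valid"]:
            return ("hit", entry["value"])
        if entry["pending"]:
            return ("stale", entry["value"])
        entry["pending"] = True
        return ("forward", None)


if __name__ == "__main__":
    cache = CoalescingCache()
    cache.store("story", "v1")
    cache.invalidate("story")
    print(cache.lookup("story"))   # first request is forwarded
    print(cache.lookup("story"))   # later request receives the invalid copy
    cache.store("story", "v2")     # replacement data arrives
    print(cache.lookup("story"))
```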
In one particular embodiment of the invention the currency or validity of a cached data item may depend upon factors such as: a desired level of system performance (e.g., a target response time to user data requests), one or more characteristics of the data item (e.g., how popular it is, the cost of refreshing it, how valuable the data is), and an allowable rate or number of refreshes that may be performed. Depending upon these factors, a particular data request may be satisfied from a cache memory (e.g., even if the cached data item is invalid) or from a data server or other primary storage device.
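The interplay of these factors might be sketched as follows, purely as an example. The `RefreshBudget` class, the `serve_from` function, and the crude estimate that a refresh adds its full cost to the observed response time are all assumptions of this sketch, not features recited by the embodiment.

```python
class RefreshBudget:
    """Caps the number of refreshes allowed in one accounting window."""

    def __init__(self, max_refreshes_per_window):
        self.max_refreshes = max_refreshes_per_window
        self.used = 0

    def try_spend(self):
        """Consume one refresh if any remain; return whether it was granted."""
        if self.used < self.max_refreshes:
            self.used += 1
            return True
        return False

    def reset(self):
        """Start a new window (e.g., called once per second by a timer)."""
        self.used = 0


def serve_from(valid, observed_ms, target_ms, refresh_cost_ms, budget):
    """Return "cache" or "data server" for one request.

    observed_ms:     current response time of the system
    target_ms:       desired response time
    refresh_cost_ms: estimated cost of refreshing this data item
    """
    if valid:
        return "cache"
    # Assumed estimate: refreshing would push response time past the
    # target, so the invalid cached item is served instead.
    if observed_ms + refresh_cost_ms > target_ms:
        return "cache"
    # Refresh only while the allowable number of refreshes is not exhausted.
    if budget.try_spend():
        return "data server"
    return "cache"


if __name__ == "__main__":
    budget = RefreshBudget(max_refreshes_per_window=1)
    print(serve_from(False, 50, 200, 100, budget))
    print(serve_from(False, 50, 200, 100, budget))
```

In this sketch the second request is answered from the cache because the single refresh allowed in the window has already been spent.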