This invention relates to the field of computer systems. More particularly, the invention provides a multi-tiered caching system and a method of operating the system to serve data requests having a range of complexity.
Caching systems are often employed to enable faster responses to data requests, especially where the data being requested is stored on a relatively slow device (e.g., disk, tape). A caching system can generally improve performance by storing all or a portion of the data in a faster device (e.g., random access memory).
In today's computing environments, software-managed caches may be implemented within an operating system or as part of an application program running above the operating system. A cache implemented within the operating system may take advantage of faster access to storage devices, while a cache implemented in an application program is usually slower due to the processing overhead added by the application code and the operating system. However, implementing a cache as part of an application program may be easier because one can take advantage of the utilities and protection (e.g., memory management) offered by the operating system. These protections are typically not available to an operating system cache.
Thus, an application program cache may be relatively easy to implement, but have relatively low performance in comparison to an operating system cache, while the operating system cache is more difficult to implement but yields greater performance. Present caching systems tend to implement one or the other of these two types of caches, depending upon a desired level of performance and/or an acceptable amount of design effort.
Further, existing caching systems are most suited for those environments in which the requested data is relatively static and/or is not subject to heavy demand. In particular, existing systems may provide adequate benefits when the cached data need not be updated on a recurring or regular basis. Unfortunately, such systems are ill suited to maintaining desired levels of performance when the requested data is dynamic in nature, particularly when the number or frequency of data requests is high. For example, on the Internet an enormous number of users request dynamic content in the form of news stories, financial data, multi-media presentations, etc., and may do so through customized user interfaces containing dynamic components. In particular, many sites or web pages accessed by users contain data that is updated or replaced on a regular basis.
For high-volume, dynamic environments such as the Internet, existing caching systems are not designed to maintain a steady level of performance. Instead, such environments are generally configured to maintain a consistent level of data quality, typically by attempting to always provide the newest or most recent version of requested data. Thus, when a master copy or version of data that is cached is altered or replaced, the version in the cache must be updated or replaced before the faster cache can once again be used to satisfy users' requests. Until the cache is updated, requests for the data must be satisfied from a slower device (e.g., where the master copy is stored). During heavy periods of traffic or when a large amount of cached data must be replaced, data requests cannot be served from the cache and, unless the web site maintains a sufficient number of alternative, slower devices to respond to the requests, performance of the web site may decline precipitously.
As a result, a web site operator is faced with a quandary. The operator may employ a sufficient number of slower devices to handle an expected or peak level of traffic, in which case the caching system is superfluous. Alternatively, the operator must be willing to allow performance to be degraded, possibly severely.
Therefore, what is needed is a caching system that can take advantage of the relatively easy development afforded to application program caches and also take advantage of the performance enhancement offered by operating system caches. More particularly, what is needed is a multi-tier caching system that incorporates both types of caches and is flexible enough to store suitable data in each type of cache.
What is also needed is a caching system and a method of operating a caching system in an environment characterized by dynamic data and/or high volumes of data requests, wherein a desired level of performance (e.g., response time to data requests) can be substantially maintained during peak or high traffic periods without obviating the need for the caching system. Also needed is a method of performance-based caching in which data consistency varies as little as possible while promoting a desired level of system performance.
In one embodiment of the invention a multi-tier caching system is provided in which a first cache is implemented in kernel or operating system space (e.g., as part of the operating system) and a second cache is implemented in user or application program space (e.g., as part of an application program).
In this embodiment the operating system space cache is designed to store and serve relatively simple data in response to fundamental or basic data requests, such as those that require a minimum amount of processing or examination. Data that is responsive to more complex requests may be stored in the application program space cache. Complex requests may require one or more parameters included in the data request to be examined in order to identify the data that should be served in response.
An analysis engine may operate under a set of guidelines or rules to determine what data should be stored in which type of cache, and/or route data requests to one cache or the other. Thus, in one method of operating a multi-tier caching system, data requests are routed to an operating system space cache or an application program space cache depending upon the complexity of the data request. In another method, the analysis engine determines which cache a data item should be stored in when it is retrieved or received from a mass storage device or a data server. In determining which cache to use to store a data item, the analysis engine may consider the request that led to the data retrieval (e.g., how complex the request is, whether the data is responsive to multiple requests), may apply some historical analysis concerning past requests, may apply guidelines or hints specified by a system administrator, etc.
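The routing of requests by complexity might be sketched as follows. This is an illustrative sketch only: the `Request` type, the `VARYING_HEADERS` set, and the rule that any query parameter or content-selecting header makes a request "complex" are assumptions made for the example and are not specified by the invention.

```python
from dataclasses import dataclass, field

# Hypothetical set of headers whose values may change which data is served.
VARYING_HEADERS = {"cookie", "accept-language", "authorization"}


@dataclass
class Request:
    path: str
    params: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)


def is_complex(req: Request) -> bool:
    """Assumed rule: a request is complex if parameters or selected
    headers must be examined to identify the responsive data."""
    if req.params:
        return True
    return any(name.lower() in VARYING_HEADERS for name in req.headers)


def choose_tier(req: Request) -> str:
    """Route simple requests to the operating system space cache and
    complex requests to the application program space cache."""
    return "application" if is_complex(req) else "kernel"
```

Under these assumptions, a request for a fixed page with no parameters would be routed to the operating system space cache, while a request carrying a query parameter or a cookie would be routed to the application program space cache. An actual analysis engine may also weigh historical analysis and administrator hints, which this sketch omits.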
A data item stored in one cache of a multi-tier caching system may migrate to a different cache depending upon its popularity (e.g., how often or frequently it is requested), age, size, invalidity or some other characteristic.
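Migration between caches based on popularity might look like the following sketch. The thresholds, the `TwoTierCache` class, and the direction of migration (promotion to the operating system space cache after repeated hits, demotion after an idle period) are assumptions chosen for illustration; the description above does not fix any of them.

```python
from dataclasses import dataclass

PROMOTE_HITS = 3          # assumed hit count that triggers promotion
DEMOTE_IDLE_SECS = 60.0   # assumed idle time that triggers demotion


@dataclass
class Entry:
    value: bytes
    tier: str = "application"
    hits: int = 0
    last_access: float = 0.0


class TwoTierCache:
    """Tracks which tier each item lives in; storage itself is simplified
    to a single dictionary for the purposes of the sketch."""

    def __init__(self):
        self.entries = {}

    def put(self, key, value, now):
        self.entries[key] = Entry(value=value, last_access=now)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry.hits += 1
        entry.last_access = now
        # Promote an item once it has proven popular.
        if entry.tier == "application" and entry.hits >= PROMOTE_HITS:
            entry.tier = "kernel"
        return entry.value

    def demote_idle(self, now):
        # Move items that have not been requested recently back down.
        for entry in self.entries.values():
            idle = now - entry.last_access
            if entry.tier == "kernel" and idle > DEMOTE_IDLE_SECS:
                entry.tier = "application"
                entry.hits = 0
```

Timestamps are passed in explicitly so the behavior is deterministic; a deployed system would more likely read a clock and might also consider age, size, or invalidity when migrating an item.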
One embodiment of the invention is particularly suited for use in a network environment, such as that of the Internet, in which HTTP (Hypertext Transfer Protocol) or other requests are received with various headers, qualifiers, parameters or other indicia that must be digested before the appropriate data can be identified and served.
In one alternative embodiment of the invention a system and methods are provided for caching data in a manner that promotes a desired level of performance, as measured by response time for data requests, load placed on a system component, number of open connections, or some other parameter. In this embodiment the consistency of the data provided in response to a request may be allowed to fluctuate, by providing stale versions of the requested data for example, in order to promote the desired level of performance.
In this alternative embodiment, when a request is received at a cache server for data that has been invalidated, the caching system may first determine whether the present or desired performance of the system (e.g., number of open connections, average or incremental response time, throughput, etc.) allows the request to be passed to the data server that stores an updated or current version of the data. The action taken in response to the request may also depend on factors such as the popularity of the requested data (e.g., how frequently or recently it has been requested) and/or its level of invalidity (e.g., how long ago it was invalidated, the severity of invalidation, when the data was last updated or replaced). Different embodiments of the invention may weigh the operative factors differently.
For example, when a request for data that is invalid on the cache server is passed to the data server because the requested data is popular and/or highly invalid, subsequent requests for the same data may be satisfied at the cache server using an invalid version (until new data is received from the data server). Conversely, when less popular and less invalid, but still invalid, data is requested from the cache server, an invalid version of the data may be served by the cache server. In addition, however, a lower priority request for an updated or replacement version of the data may be passed to the data server in order to retrieve the newer data.
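The handling of a request for invalidated data described above might be sketched as a small decision function. The thresholds, the use of open connections as the only performance measure, and the `refresh_pending` flag are assumptions for illustration; as noted, different embodiments may weigh the operative factors differently.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FORWARD = "forward request to data server"
    SERVE_STALE = "serve invalid copy from cache"
    SERVE_STALE_AND_REFRESH = "serve invalid copy, queue low-priority refresh"


@dataclass
class CachedItem:
    requests_per_min: float        # popularity
    secs_since_invalidated: float  # level of invalidity
    refresh_pending: bool = False  # a refresh has already been requested


# Assumed thresholds; real values would be tuned per deployment.
MAX_OPEN_CONNECTIONS = 100
POPULAR_RPM = 50.0
VERY_STALE_SECS = 300.0


def decide(item: CachedItem, open_connections: int) -> Action:
    # While new data is on its way, keep serving the invalid version.
    if item.refresh_pending:
        return Action.SERVE_STALE
    urgent = (item.requests_per_min >= POPULAR_RPM
              or item.secs_since_invalidated >= VERY_STALE_SECS)
    headroom = open_connections < MAX_OPEN_CONNECTIONS
    if urgent and headroom:
        item.refresh_pending = True
        return Action.FORWARD
    # Less urgent data, or a system already at its limit: serve the
    # invalid copy and ask for newer data at lower priority.
    item.refresh_pending = True
    return Action.SERVE_STALE_AND_REFRESH
```

In this sketch the first request for popular invalid data is forwarded, and later requests for the same item are served from the cache until the caller clears `refresh_pending` upon receiving new data. Treating an urgent request as low priority when the system has no headroom is one possible policy, not one dictated by the description.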
In one implementation of this alternative embodiment of the invention the currency or validity of a cached data item may depend upon factors such as: a desired level of system performance (e.g., a target response time to user data requests), one or more characteristics of the data item (e.g., how popular it is, the cost of refreshing it, how valuable the data is), and an allowable rate or number of refreshes that may be performed. Depending upon these factors, a particular data request may be satisfied from a cache memory (e.g., even if the cached data item is invalid) or from a data server or other primary storage device.
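One way to combine item characteristics with an allowable number of refreshes is to rank invalid items and refresh only the highest-ranked ones in each interval. The scoring formula below (popularity times value, divided by refresh cost) and the names `InvalidItem` and `pick_refreshes` are assumptions made for this sketch, not a formula given by the invention.

```python
from dataclasses import dataclass


@dataclass
class InvalidItem:
    key: str
    popularity: float    # e.g., requests per minute
    value: float         # relative importance of fresh data
    refresh_cost: float  # relative cost of refreshing; must be > 0


def score(item: InvalidItem) -> float:
    """Assumed benefit-per-cost score for refreshing an item."""
    return item.popularity * item.value / item.refresh_cost


def pick_refreshes(items, budget: int):
    """Return the keys of at most `budget` items to refresh this interval,
    highest score first. Remaining items continue to be served stale."""
    ranked = sorted(items, key=score, reverse=True)
    return [item.key for item in ranked[:budget]]
```

Items that are not selected would continue to be served from the cache in their invalid form until a later interval, which keeps the load on the data server bounded by the refresh budget.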