The present invention relates to caches employed with a data storage and retrieval system, and more particularly to systems and methods for dynamically specifying information caching in a data storage and retrieval system.
A cache is a temporary memory storage used to store information from a typically larger and slower memory storage. Fast memory storage can be relatively expensive to implement in large quantities. Larger, slower memory storages have economic advantages, with corresponding reductions in performance. A smaller, faster memory storage used as a cache in front of a larger, slower memory storage can increase performance over the large memory storage alone, while not significantly impacting overall costs.
The advantages of a cache are realized by implementing the cache with a storage memory that is faster than the large memory storage. Data that is frequently accessed in the large memory storage can be temporarily stored in the cache, so that faster overall access times for the data may be realized. Typical cache implementations involve some type of fast RAM storage, such as static RAM (SRAM), that temporarily stores data from a larger memory storage, such as dynamic RAM (DRAM), ROM or disk storage. For example, CPUs often use one or more caches to improve access times for instructions and data read from a main memory. Data storage and retrieval systems typically use one or more caches to temporarily store data read from, or to be written to, a mass storage device such as a disk storage device. In practice, a cache typically improves performance by storing data that is frequently accessed, such as variables that are updated regularly. The use of a cache in the form of a fast memory storage device to store frequently accessed data is typically highly effective in speeding or streamlining data access in a data storage and retrieval system.
Depending on the implementation, a given large memory storage and an associated cache can be constructed to have all read and write accesses from an entity seeking access to data on the large memory storage occur through the cache. That is, when an entity seeks access to the data on the large memory storage, the access request is applied to the cache, and then, if the data is not in the cache, to the large memory storage. Accordingly, the entity receives data from the large memory storage through the cache, rather than directly from the large memory storage. Once the data is stored in the cache, future access requests for the data receive a response directly from the cache, rather than making another request to the large memory storage.
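The arrangement in which every access passes through the cache can be sketched as follows. This is a minimal illustration only: the class name `ReadThroughCache`, the use of dictionaries to stand in for the cache and the large memory storage, and the `backing_reads` counter are assumptions made for the example and are not taken from any particular system.

```python
class ReadThroughCache:
    """All reads pass through the cache; misses are filled from the large storage."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the large memory storage
        self.entries = {}                   # stands in for the fast cache memory
        self.backing_reads = 0              # counts accesses to the large storage

    def read(self, key):
        if key not in self.entries:
            # Miss: fetch from the large storage and keep a copy in the cache.
            self.entries[key] = self.backing_store[key]
            self.backing_reads += 1
        # The requester always receives the data from the cache.
        return self.entries[key]


disk = {"block-7": b"payload"}
cache = ReadThroughCache(disk)
first = cache.read("block-7")   # miss: goes to the large storage once
second = cache.read("block-7")  # hit: served from the cache alone
```

In this sketch the second request never reaches the large storage, which is the behavior described above for future access requests.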
Other cache and large memory storage relationships can be implemented as well. For example, read requests to the large memory storage when the data is not in the cache can be processed to return the read data directly from the large memory storage to the requesting entity, as well as storing a copy of the read data in the cache. Similarly, write requests can be implemented to write directly to the large memory storage when the data is not in cache, without causing the data to be fetched to the cache from the large memory storage.
In a data storage and retrieval system, large memory storage, also referred to as mass storage, is typically implemented with a disk storage device, referred to here as a “disk.” When a cache is used and an entity requests data from the disk, the data storage and retrieval system typically first checks the cache to determine whether any of the requested data is contained within the cache. If the system determines that the requested data is in the cache, the data is returned to the requesting entity directly from the cache. The determination that requested data resides within the cache is known as a “cache hit.” The percentage of requests or accesses to data that result in a cache hit is referred to as a “hit rate” or “hit ratio” for the cache. Similarly, when a request for data from disk is compared against the cache and the data is not found, the event is known as a “cache miss.” A cache miss usually calls for the data to be read from disk into the cache to provide the data to the entity making the data access request.
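The hit, miss and hit rate terminology can be illustrated with a short sketch. The class `CountingCache` and its counters are hypothetical names chosen for the example, and a dictionary stands in for the disk.

```python
class CountingCache:
    """Tracks cache hits and misses so that a hit rate can be computed."""

    def __init__(self, disk):
        self.disk = disk
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.entries:
            self.hits += 1    # cache hit: data already resides in the cache
        else:
            self.misses += 1  # cache miss: data is read from disk into the cache
            self.entries[key] = self.disk[key]
        return self.entries[key]

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


disk = {"a": 1, "b": 2}
cache = CountingCache(disk)
for key in ["a", "b", "a", "a"]:
    cache.read(key)
rate = cache.hit_rate()  # two misses followed by two hits
```

With this request sequence, the first access to each key is a miss and the two repeated accesses to `"a"` are hits, so half of the requests are served from the cache.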
When a cache becomes full, or reaches a predetermined threshold of data content, new data read into the cache causes older data to be overwritten or ejected. The exercise of replacing or ejecting data from the cache is typically controlled using a replacement policy. A replacement policy is a structured operation for deciding which data in the cache should be overwritten or ejected. An example of a replacement policy is based on a least recently used (LRU) algorithm, which identifies least recently used data locations in the cache. If the data in the identified data location need not be stored on disk, the data location is marked as free and the new data read from disk overwrites the marked location. If the data identified in the data location is to be stored on disk, the data is written to disk prior to the newly requested data being read into the cache. Writing the data to disk and fetching the requested data to the cache involves two accesses to disk in the event of such a cache miss. Other types of replacement policies, sometimes referred to as cache algorithms, may be implemented. A cache algorithm can be customized based on such parameters as the size of the cache, the size of the data elements stored in the cache, latencies and throughputs of the cache and the mass storage, as well as other criteria impacting operation of the storage and retrieval system.
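A least recently used replacement policy with write-back of modified data might be sketched as below. The sketch assumes a fixed entry count as the fullness threshold, uses Python's `collections.OrderedDict` to track recency, and counts disk accesses in a hypothetical `disk_accesses` field so that the two-access cost of ejecting modified data is visible. It is one possible arrangement, not a definitive implementation.

```python
from collections import OrderedDict


class LRUCache:
    """LRU replacement; modified entries are written to disk when ejected."""

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> (value, modified); oldest first
        self.disk_accesses = 0

    def _eject_one(self):
        key, (value, modified) = self.entries.popitem(last=False)
        if modified:
            # The ejected data must be stored on disk before it is replaced.
            self.disk[key] = value
            self.disk_accesses += 1

    def read(self, key):
        if key not in self.entries:
            if len(self.entries) >= self.capacity:
                self._eject_one()
            self.entries[key] = (self.disk[key], False)
            self.disk_accesses += 1
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key][0]

    def write(self, key, value):
        if key in self.entries:
            self.entries[key] = (value, True)
            self.entries.move_to_end(key)
        else:
            # Assumption for this sketch: a write miss goes straight to disk.
            self.disk[key] = value
            self.disk_accesses += 1


disk = {"a": 1, "b": 2, "c": 3}
cache = LRUCache(disk, capacity=2)
cache.read("a")       # miss: one disk access
cache.write("a", 10)  # hit: entry is modified in the cache only
cache.read("b")       # miss: one disk access, cache is now full
cache.read("c")       # miss: ejects "a" (write to disk), then reads "c"
```

The final read costs two disk accesses because the least recently used entry had been modified and had to be written to disk before the new data was read in.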
An entity such as a user or application program seeking to write data makes a write request, which causes the cache to be checked for the data location sought to be written. If the data locations to be written are found in the cache, the write takes place directly to the cache to take advantage of fast access for writing data. Alternatively, in the event of a cache miss, the data can be written to disk without fetching the data location contents into the cache.
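The write path described above can be sketched as a single function. The function name `handle_write`, the dictionary arguments and the `modified` set are illustrative assumptions; the returned string is included only so that the path taken can be observed.

```python
def handle_write(cache, modified, disk, key, value):
    """Write to the cache on a hit; on a miss, write to disk without fetching."""
    if key in cache:
        cache[key] = value
        modified.add(key)  # remember that this cached location has changed
        return "cache"
    # Cache miss: the location is not fetched into the cache.
    disk[key] = value
    return "disk"


cache = {"x": 1}
modified = set()
disk = {"x": 1, "y": 2}

hit_path = handle_write(cache, modified, disk, "x", 100)
miss_path = handle_write(cache, modified, disk, "y", 200)
```

After these two calls the cached copy of `"x"` differs from the disk copy until it is written back, while `"y"` has been updated on disk and remains absent from the cache.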
In some implementations, data written to the cache is subsequently written to disk after a predetermined interval of time, or in response to a predetermined set of events. For example, data written to the cache may be written to disk after a given interval of time measured from the write to the cache. The time interval helps to avoid a large disk access workload by spreading out disk accesses. Data to be written to disk may additionally or alternatively be queued to help manage disk access workload. Data written to the cache may also be written or “committed” to disk when new data is to be read into the cache from disk, calling for a replacement of the data written to the cache. For example, if the cache becomes full, or populated to a given threshold, a data access request for data that is not in the cache may call for specified cache data to be ejected, and if need be, the ejected data is written to disk. Cache locations that have been modified are tracked to ensure that they are later written to disk. Locations in the cache with written or modified data are often referred to as “dirty.”
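Committing dirty data after a time interval can be sketched as follows. The class `WriteBackBuffer`, the `max_age` parameter and the explicit `now` arguments are assumptions made so that the example is deterministic; a real system would likely use its own clock. The sketch also assumes the interval is measured from the first uncommitted write to a location.

```python
class WriteBackBuffer:
    """Holds written data in the cache and commits it to disk after max_age."""

    def __init__(self, disk, max_age):
        self.disk = disk
        self.max_age = max_age
        self.entries = {}
        self.dirty_since = {}  # key -> time of the first uncommitted write

    def write(self, key, value, now):
        self.entries[key] = value
        # Keep the earliest write time so the interval is not restarted.
        self.dirty_since.setdefault(key, now)

    def flush_due(self, now):
        """Commit every dirty location whose interval has elapsed."""
        due = [k for k, t in self.dirty_since.items() if now - t >= self.max_age]
        for key in due:
            self.disk[key] = self.entries[key]
            del self.dirty_since[key]  # the location is clean again
        return due


disk = {}
buffer = WriteBackBuffer(disk, max_age=5.0)
buffer.write("a", 1, now=0.0)
buffer.write("b", 2, now=3.0)
early = buffer.flush_due(now=4.0)  # nothing has aged enough yet
later = buffer.flush_due(now=5.0)  # only "a" has reached max_age
```

Because each location becomes due at a different time, the disk writes are spread out rather than issued together.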
Sometimes, processes are implemented to write directly to a disk and bypass the cache. A cache may contain a copy of data from disk, where the disk entry is modified by a bypassing process, in which case the data in the cache becomes “stale.” Another example of when data in a cache may become stale is when two or more caches are configured to work together. For example, caches may be used in conjunction with each other to share copies of data for multiple disks. If one copy of the data changes while another does not change, the non-changing copy can be considered to be stale data.
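How a bypassing write leaves stale data in a cache can be shown with a short sketch. The helper names `is_stale` and `bypass_write_with_invalidate` are hypothetical, and invalidating the cached copy is only one assumed way of handling the situation; the text above does not prescribe a remedy.

```python
def is_stale(cache, disk, key):
    """A cached copy is stale when it no longer matches the disk entry."""
    return key in cache and cache[key] != disk[key]


def bypass_write_with_invalidate(caches, disk, key, value):
    """Write directly to disk and drop the copy held by each listed cache."""
    disk[key] = value
    for cache in caches:
        cache.pop(key, None)  # assumed remedy: discard the outdated copy


disk = {"k": "v1"}
cache = {"k": disk["k"]}      # the cache holds a copy of the disk entry

disk["k"] = "v2"              # a process writes to disk, bypassing the cache
stale_after_bypass = is_stale(cache, disk, "k")

bypass_write_with_invalidate([cache], disk, "k", "v3")
stale_after_invalidate = is_stale(cache, disk, "k")
```

After the plain bypass the cache still returns the old value, whereas after the invalidating write the next read would miss and fetch the current entry from disk.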
Typically, caches are organized with a cache manager to implement caching algorithms, track dirty data locations and communicate with other cache managers or large memory storage to maintain data consistency between caches and large memory storage. Maintaining consistency for data across different caches or a cache and a large memory storage, for example, is known as “cache coherency.”
Caches are typically configured for the system in which they are used with the intent of optimizing performance. Accordingly, cache implementations are specific to memory size and mass storage device parameters, such as latency or throughput, and are typically not designed to be flexible or scalable. A cache for a given data storage and retrieval system is usually specifically optimized for that system in terms of speed and performance, so that its operating characteristics are fixed. Because of the static nature of operational characteristics of such a cache, it is typically not capable of responding to changes in demands or changes in operating parameters in relation to the data storage and retrieval system.