1. Technical Field
This application generally relates to caching, and more particularly to cache management as may be used in a data storage system.
2. Description of Related Art
Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as the Symmetrix™ family of data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more host processors and provide storage services to each host processor. An example data storage system may include one or more data storage devices, such as those of the Symmetrix™ family, that are connected together and may be used to provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using the data storage system. For example, a host processor may perform basic system I/O operations in connection with data requests, such as data read and write operations.
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may nor correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data stored therein.
Performance of a storage system may be improved by using a cache. In the case of a disk drive system, the cache may be implemented using a block of semiconductor memory that has a relatively lower data access time than the disk drive. Data that is accessed is advantageously moved from the disk drives to the cache so that the second and subsequent accesses to the data may be made to the cache rather than to the disk drives. Data that has not been accessed recently may be removed from the cache to make room for new data. Often such cache accesses are transparent to the host system requesting the data.
Data may be stored in a cache in order to increase efficiency. However, there can be a cost associated with performing cache management operations, such as storing and retrieving data from the cache.
One technique for implementing a cache is to store the data in blocks and link each of the blocks together in a doubly linked ring list referred to herein as a replacement queue. Each block of the replacement queue represents a block of data from a logical disk unit. The blocks or slots are placed in the doubly linked ring list in the order in which they are retrieved from the disk. A pointer may point to the block that was most recently added to the list. Thus, when a new block is to be added to the cache within the replacement queue, the structure of the replacement queue, in combination with the head pointer, may be used to determine the oldest block in the replacement queue that is to be removed to make room for the new block. An implementation of the replacement queue may use both a “head” pointer and a “tail” pointer identifying, respectively, the beginning and end of the replacement queue. The “tail” may determine the oldest block or slot in the replacement queue. Two such pointers may be used in an replacement queue arrangement as it may be desirable in accordance with cache management schemes in which some data may remain permanently in the cache and the “oldest” and “newest” data may not be adjacent to one another.
Cache management techniques are described, for example, in issued U.S. Pat. No. 5,381,539, Jan. 10, 1995, entitled “System and Method for Dynamically Controlling Cache Management”, Yanai et al., assigned to EMC Corporation of Hopkinton, Mass., which is herein incorporated by reference, in which a data storage system has a cache controlled by parameters including: (a) a minimum number of data storage elements which must be retrieved and stored in cache memory and used by the system before the cache management system recognizes a sequential data access in progress; (b) the maximum number of tracks or data records which the cache management system is to prefetch ahead; and (c) the maximum number of sequential data elements to be stored in cache before the memory containing the previously used tracks or data records are reused or recycled and new data written to these locations. The cache memory is in a least-recently used circular configuration in which the cache management system overwrites or recycles the oldest or least recently used memory location. The cache manager provides monitoring and dynamic adjustment of the foregoing parameters.
Described in issued U.S. Pat. No. 5,592,432, Jan. 7, 1997, entitled “Cache Management System Using Time Stamping for Replacement Queue”, Vishlitzky et al., which is herein incorporated by reference, is a system that includes a cache directory listing data elements in a cache memory and a cache manager memory including a replacement queue and data structures. A cache manager determines which data element should be removed or replaced in the cache memory based on the elapsed time the data element has been in the memory. If the elapsed time is less than a predetermined threshold, the data element will be maintained in the same location in the replacement queue saving a number of cache management operations. The predetermined threshold is established as the average fall through time (FTT) of prior data elements in the memory. A modified least-recently-used replacement procedure uses time stamps indicating real or relative time when a non-write-pending data element was promoted to the tail of the replacement queue, the most-recently used position. Also disclosed is another embodiment in which the number of times the data element is accessed while in the memory is compared to a fixed number. If the data element has been accessed more than the fixed number, it is placed at the tail of the replacement queue ensuring a longer period for the data element in the memory.
Described in U.S. Pat. No. 5,206,939, Apr. 27, 1993, entitled “System and Method for Disk Mapping and Retrieval”, Yanai et al, which is herein incorporated by reference, is a device-by-device cache index/directory used in disk mapping and data retrieval.
Different techniques may be used to manage the cache. In some implementations, cache management operations may be costly in terms of computer resources, such as execution time and memory. For example, it may costly to determine an available slot for use. An executing processor may make multiple attempts at different slots before locating one which can be used to store new data in the cache. There are also costs associated with inserting and removing an element from the cache that may vary with the particular cache implementation and associated data structures.
Thus, it may be desirable to provide an efficient cache management technique which reduces the costs associated with cache management. The reduction may be produced as a result of more efficient processing of one or more individual operations. The reduction may also be produced by reducing the frequency with which any one or more cache management operations are performed in cache operation.