1. Field of the Invention
This invention relates generally to hierarchical caching of data and particularly to selective purging of duplicate cache entries for Direct Access Storage Device (DASD) subsystems.
2. Description of the Related Art
Modern high-performance data processors use a private high-speed hardware-managed buffer memory in front of the main data store to reduce average memory access delay at the Central Processing Unit (CPU). This high-speed buffer is denominated a "cache" because it is usually transparent to the applications programmer. Because hardware speed is generally proportional to hardware cost, cached memory performance can be cost-effectively improved by adding another, smaller but faster cache in front of the first cache. Such multilevel cache "hierarchies" are known in the art to give rise to a requirement for "coherence management" in shared-memory multiprocessing configurations because each CPU is directly coupled only to its private cache. That is, the temporary contents of many separate private cache buffers must somehow be coordinated to ensure that only the most recent record copies are committed to the underlying main data store.
An analogous problem arises in systems that employ multilevel data storage subsystems. For instance, a modern shared-storage multiprocessing system may include a plurality of host processors coupled through several cache buffer levels to a hierarchical data store that includes a random access memory level followed by one or more larger, slower storage levels such as Direct Access Storage Device (DASD) and tape library subsystems. Transfer of data up and down such a multilevel shared-storage hierarchy requires data transfer controllers at each level to optimize overall transfer efficiency.
The IBM 3990 storage controller is an example of a storage controller used to control data transfer between DASD-based storage libraries and host computer processors. This storage controller includes a local cache memory for buffering data transfers to and from the underlying DASD storage subsystem. The IBM 3990 storage control subsystem is fully described in "IBM 3990 Storage Control Planning, Installation and Storage Administration Guide" (IBM document GA32-0100-04, International Business Machines Corporation, copyright 1991) and in "IBM 3990 Storage Control Introduction" (IBM document GA32-0098-0, International Business Machines Corporation, copyright 1987). Both of these documents are fully incorporated herein by this reference.
A typical (IBM 3990 Model 3) storage controller handles up to 16 channels from host computers and up to 64 logical DASDs. Within the storage controller are two multipath storage directors and four storage paths, two of which are associated with each multipath storage director. Each multipath storage director may be connected to up to eight incoming channels from host computers, for a total of 16 channels. Thus, each multipath storage director functions as an eight-by-two switch.
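The switching arrangement described above may be illustrated, for explanatory purposes only, by the following sketch. The class and identifier names are the present writer's assumptions and do not reflect IBM's internal design; the sketch merely shows a director binding any of its eight channels to whichever of its two storage paths is free.

```python
# Illustrative sketch of one multipath storage director: an eight-by-two
# switch mapping eight host channels onto two storage paths. All names
# here are hypothetical, not taken from the IBM 3990 documentation.

class MultipathStorageDirector:
    """An eight-by-two switch: 8 incoming channels, 2 storage paths."""

    def __init__(self, channels, storage_paths):
        assert len(channels) == 8 and len(storage_paths) == 2
        self.channels = set(channels)
        self.busy = {path: False for path in storage_paths}

    def connect(self, channel):
        """Bind a requesting channel to any free storage path."""
        if channel not in self.channels:
            return None  # channel belongs to the other director
        for path, in_use in self.busy.items():
            if not in_use:
                self.busy[path] = True
                return path
        return None  # both storage paths busy; the request must wait

    def disconnect(self, path):
        self.busy[path] = False


# A Model 3 controller contains two such directors, covering 16 channels
# and four storage paths in total.
director_a = MultipathStorageDirector([f"CH{i}" for i in range(8)],
                                      ["SP0", "SP1"])
director_b = MultipathStorageDirector([f"CH{i}" for i in range(8, 16)],
                                      ["SP2", "SP3"])
```

With both of director_a's storage paths claimed, a third connection attempt returns no path, reflecting the two-path limit per director.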
Recent advances in DASD storage library art include exploitation of the Redundant Arrays of Inexpensive Disks (RAID) technology now well-known in the art. RAID DASD technology has led to development of a DASD storage system rack incorporating a plurality of cached DASD modules each organized to emulate logical DASD storage volumes. Each module includes a high-speed cache buffer memory for facilitating data transfers between a specific plurality of DASDs and a channel to the adjacent storage controller. Such a module is herein denominated a Cached Storage Drawer (CSD) subsystem.
As is known in the art, channels are physical links between a host computer processor and an external device, such as a DASD data storage subsystem. Usually, a host computer has a small number of channels, each physically connected to channel control multiplexers such as the IBM 3990 storage controller. For instance, several host computer processors may be connected to one IBM 3990-3 storage controller, which in turn is connected to sixty-four DASD volumes. When transferring data, the storage controller can secure any one of the plurality of channels and storage paths back to the host computer and forward to the DASD to establish a temporary input/output transaction data path. It is a feature of the IBM 3990 storage controller that such a data path between a host computer and a DASD subsystem may be severed into two separate connection intervals, each of which may be handled over a different physical channel and storage path. That is, a DASD access request need not be answered over the same channel on which it is received. This feature increases storage controller efficiency because the storage controller is free to handle other tasks during the disconnect interval between request and response.
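The split-connection feature described above can be modeled, again purely for illustration and with hypothetical names, as a controller that accepts a request over one channel, releases that channel during the disconnect interval, and later answers over any channel that happens to be free.

```python
# Illustrative model (names are assumptions) of the severed data path:
# the request and the response are two separate connection intervals,
# each of which may use a different physical channel.
import itertools


class StorageController:
    def __init__(self, channels):
        self.free_channels = set(channels)
        self.pending = {}  # request id -> requesting host
        self.ids = itertools.count()

    def receive_request(self, channel, host):
        """First connection interval: accept the request, then disconnect.
        The channel is released for other work during the DASD access."""
        rid = next(self.ids)
        self.pending[rid] = host
        self.free_channels.add(channel)
        return rid

    def respond(self, rid):
        """Second connection interval: answer over ANY free channel,
        not necessarily the one that carried the request."""
        channel = self.free_channels.pop()
        host = self.pending.pop(rid)
        return channel, host


ctrl = StorageController(["CH0", "CH1"])
rid = ctrl.receive_request("CH0", "hostA")
channel_used, host = ctrl.respond(rid)
```

The response reaches the correct host, but `channel_used` may be either channel, which is precisely the efficiency the disconnect interval buys.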
The independent development of a new CSD RAID type of DASD subsystem and a distributed host processor storage controller has given rise to a new variation of the cache hierarchy architecture known in the art. The IBM 3990 type of storage controller provides a cache buffer memory to support data transfer between host computer and DASD-based storage subsystem. The CSD subsystem provides internal cache buffer memory to support data transfers in and out of the RAID plurality of DASDs. Thus, connecting the IBM 3990 type of storage controller to a CSD storage system creates an unplanned dual-cache hierarchy comprising the storage controller cache and the CSD cache. Each of these two cache memories is independently managed for different purposes, including the aging and demotion of cache entries according to a Least Recently Used (LRU) priority scheme and the like. This unplanned duplication presents novel problems and opportunities heretofore unknown in the hierarchical cache art.
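The duplication problem created by this unplanned dual-cache hierarchy may be sketched as follows. The sketch assumes, for illustration only, a minimal LRU cache at each of the two levels; because each level is managed independently, a read that misses both levels leaves a copy of the same track in both caches.

```python
# Minimal sketch of the unplanned dual-cache hierarchy: a storage
# controller cache stacked over a Cached Storage Drawer (CSD) cache,
# each independently managed by its own LRU scheme. Names and sizes
# are illustrative assumptions.
from collections import OrderedDict


class LRUCache:
    """Each level manages its own entries by Least Recently Used order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key, value):
        """Record an access, promoting the key and demoting the LRU entry."""
        if key in self.entries:
            self.entries.move_to_end(key)
        else:
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[key] = value


controller_cache = LRUCache(capacity=4)  # IBM 3990-style controller cache
csd_cache = LRUCache(capacity=4)         # CSD internal cache

def read_track(track):
    """A read that misses both levels stages the track into BOTH caches."""
    if track not in controller_cache.entries:   # controller cache miss
        if track not in csd_cache.entries:      # CSD cache miss: go to DASD
            csd_cache.access(track, f"data-{track}")
        controller_cache.access(track, f"data-{track}")

for t in ("T1", "T2", "T3"):
    read_track(t)

# Every track now occupies space at both levels simultaneously.
duplicates = controller_cache.entries.keys() & csd_cache.entries.keys()
```

After three reads, all three tracks are duplicated across the two levels, wasting half of the combined cache space; this is the inefficiency the invention addresses.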
Cache memory is best known for its application as an adjunct to random-access memory (RAM), where the cache buffer provides high-speed storage for frequently-used instructions and data. Practitioners in the art have proposed many important improvements to multiple-cache hierarchies employed in distributed multi-processor systems. The fundamental distributed system cache management problem is to balance the tension between minimizing cross-interrogation overhead and maximizing cache coherency. Processor time is required to cross-interrogate individual caches when searching for duplicate copies of cache data blocks to ensure that all copies but the latest are flushed from every cache. This cleaning operation provides cache "coherency", which exists when each of the plurality of host processors has access only to the latest version of a cached data line or block. The struggle for coherency in distributed systems invites increased cross-interrogation processor overhead, and many practitioners have proposed improvements to reduce cross-interrogation without reducing coherency.
For instance, in U.S. Pat. No. 4,574,346, Hartung proposes marking cache data lines for retention or discard depending on whether the data line has a "transient" status or "permanent" status. Transient data lines exist only temporarily and are never transferred to storage levels underlying cache. This arrangement eliminates cross-interrogation overhead for the "transient" data lines. Similarly, in U.S. Pat. No. 4,885,680, Anthony et al. propose marking data that is temporarily cachable to facilitate the efficient management of that data in cache. When an "invalidate marked data" instruction is received, the cache controls sweep through the entire cache directory and invalidate all marked cache lines in a single pass, thereby eliminating the usual cache coherency overhead.
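The single-pass sweep taught by Anthony et al. may be illustrated by the following sketch, which is the present writer's simplified rendering and not the patented design itself: each directory entry carries a mark bit, and one pass over the directory invalidates every marked line without any cross-interrogation.

```python
# Illustrative sketch (not the actual design of U.S. Pat. No. 4,885,680)
# of invalidating all marked, temporarily-cachable lines in one sweep of
# the cache directory.

class CacheLine:
    def __init__(self, tag, marked=False):
        self.tag = tag
        self.marked = marked   # line holds temporarily cachable data
        self.valid = True

def invalidate_marked(directory):
    """Single pass: drop every valid marked line, return the count."""
    invalidated = 0
    for line in directory:
        if line.valid and line.marked:
            line.valid = False
            invalidated += 1
    return invalidated

directory = [CacheLine("A", marked=True),
             CacheLine("B"),
             CacheLine("C", marked=True)]
count = invalidate_marked(directory)
```

The sweep touches each directory entry exactly once, so the cost is one linear pass regardless of how many processors share the underlying store.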
Similarly, in U.S. Pat. No. 5,130,922, Liu proposes adding status bits in the cache directory so that cache "exclusive" status (which locks the cache entry for a single processor) can be anticipated without incurring performance penalties when the exclusive assignment is inappropriate.
In U.S. Pat. No. 4,442,487, Fletcher et al. add two flags to the directory entry that serve to communicate from main memory to private and shared caches how the given page of data is to be used. Essentially, pages that can be both written and shared are moved from main memory to a shared level-two cache and therefrom to a shared level-one cache, with the host processors executing only from the shared level-one cache. All other pages are moved from main memory to private level-two and level-one caches for the requesting processor. Thus, Fletcher et al. permit a processor to execute from either its private cache or the shared level-one cache, thereby allowing several processors to share an underlying main memory without encountering cross-interrogation overhead. The cost of this feature is voluminous cache space.
In U.S. Pat. No. 4,471,429, Porter et al. disclose a cache clearing system that uses a duplicate directory to reflect the contents of the cache directory within its associated cache unit. Commands affecting information segments within the main memory are transferred by the system controller unit to each of the duplicate directories to determine if the affected information segment is stored in the associated cache memory and, if so, the duplicate directory issues a "clear" command through the system controller to clear the information segment from the associated cache unit, thereby improving cache flushing efficiency.
Also, in U.S. Pat. No. 4,322,795, Lange et al. disclose a similar duplicate directory arrangement for selective clearing of the cache in multiprocessor systems where data in a cache becomes obsolete because of changes made to the corresponding data in main memory by another processor. Lange et al. teach a LRU scheme for selecting a storage location for data retrieved from main memory responsive to a cache miss. This scheme provides a higher cache hit ratio, thereby improving flushing efficiency available from the duplicate directory arrangement.
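The duplicate-directory arrangement common to Porter et al. and Lange et al. can be sketched as follows; the class names and interfaces are assumptions introduced for illustration. The system controller consults its copy of each cache directory and issues a "clear" command only to caches whose duplicate directory shows the affected segment is actually resident.

```python
# Illustrative sketch (not the patented designs) of duplicate-directory
# cache clearing: the controller filters "clear" commands through mirror
# copies of each cache directory, sparing caches that do not hold the
# affected segment.

class CacheUnit:
    def __init__(self, name):
        self.name = name
        self.directory = set()   # tags resident in this cache
        self.clears_received = 0

    def clear(self, tag):
        self.directory.discard(tag)
        self.clears_received += 1


class SystemController:
    def __init__(self, caches):
        self.caches = caches
        # One duplicate directory per cache, mirroring its contents.
        self.duplicates = {c.name: set(c.directory) for c in caches}

    def main_memory_write(self, tag):
        """Clear the segment only from caches whose duplicate shows a hit."""
        for cache in self.caches:
            if tag in self.duplicates[cache.name]:
                cache.clear(tag)
                self.duplicates[cache.name].discard(tag)


holder = CacheUnit("cache0")
holder.directory.add("seg7")
bystander = CacheUnit("cache1")
controller = SystemController([holder, bystander])
controller.main_memory_write("seg7")
```

Only the cache that actually held the segment receives a clear command; the bystander cache is never interrogated, which is the flushing efficiency these references pursue.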
It is clear from these references that the present art focuses primarily on the multiple independent cache coherency problem and neither teaches nor suggests schemes for exploiting two independently-managed high-speed cache buffer memories that are hierarchically connected. When a CSD data storage library subsystem is coupled to a plurality of distributed host processors through one or more cached storage controllers, there is a clearly-felt need in the art for a hierarchical cache management technique that offers improved caching efficiency through reduced duplication of cached data blocks. The related unresolved deficiencies are clearly felt in the art and are solved by this invention in the manner described below.