1. Field of the Invention
The present invention concerns computer memory systems, and in particular, a system with redundant data caches for preventing data loss should one cache fail.
2. Related Art
Disk drive based memory affords large amounts of storage capacity at a relatively low cost. Unfortunately, access to disk drive memory is slow relative to the processing speed of modern microprocessors. A cost effective, prior art solution to this problem provides a cache memory between the processor and the disk memory system. The cache memory has a relatively small storage capacity, but provides high speed access to the data it holds.
The operating principle of the disk cache memory is the same as that of a central processing unit (or CPU) cache. More specifically, the first time an instruction or data location is addressed, it must be accessed from the lower speed disk memory. Subsequent accesses to the same instruction or data are done via the faster cache memory, thereby minimizing access time and enhancing overall system performance. However, since the storage capacity of the cache is limited, and typically is much smaller than the storage capacity of the disk storage, the cache is often full and some of its contents must be replaced as new instructions or data are accessed from the disk storage. The cache is managed, in various ways, such that it stores the instructions or data most likely to be needed at a given time. When the cache is accessed and contains the requested data, a cache "hit" occurs. Otherwise, if the cache does not contain the requested data, a cache "miss" occurs. Thus, the cache contents are typically managed in an attempt to maximize the cache hit-to-miss ratio.
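By way of illustration only, the hit and miss behavior described above may be sketched as follows. The sketch assumes a least-recently-used replacement policy, which is only one of the various ways a cache may be managed and is not taken from the text. The class name DiskCache, the dictionary standing in for disk storage, and the capacity of two entries are likewise hypothetical.

```python
from collections import OrderedDict


class DiskCache:
    """Small cache in front of a slow backing store (a dict stands in for the disk).

    Assumption: least-recently-used (LRU) replacement; the text does not
    name a specific management policy.
    """

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.entries:
            # Cache hit: serve from the cache and mark as recently used.
            self.hits += 1
            self.entries.move_to_end(address)
            return self.entries[address]
        # Cache miss: fetch from the slow store, then keep a copy.
        self.misses += 1
        value = self.backing_store[address]
        self.entries[address] = value
        if len(self.entries) > self.capacity:
            # Cache is full: replace the least recently used entry.
            self.entries.popitem(last=False)
        return value


disk = {addr: f"block-{addr}" for addr in range(8)}
cache = DiskCache(disk, capacity=2)
for addr in [0, 1, 0, 2, 1]:
    cache.read(addr)
print(cache.hits, cache.misses)  # prints: 1 4
```

In this access sequence only the second read of address 0 is a hit; the final read of address 1 misses because that entry was replaced when address 2 was loaded.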
FIG. 1 illustrates a high level block diagram of a conventional disk array controller 104 arranged between a host computer 102 and a disk storage array 106. The host computer 102 may include a processor 114, a memory 116, and an input/output interface 118 sharing a bus 112. The memory 116 may include a program storage section for storing program instructions for execution by the processor 114. The input/output interface 118 may use a standard communications protocol, such as the Small Computer System Interface (or "SCSI") protocol, to facilitate communication with peripheral devices. The disk array 106 may include, for example, an array of magnetic or optical disks 132.
The disk array controller 104 includes a controller device 124, a cache 126, and input/output interface(s) 128 and 130, which share a bus 122. The controller 124, which may be an application specific integrated circuit (or "ASIC") or a processor executing stored instructions for example, controls reading from and writing to the cache 126 and the disk array 106. The input/output interface 128 may use the SCSI protocol to facilitate communication between it and the host computer 102. Similarly, the input/output interface 130 may use the SCSI protocol to facilitate communication between it and the disk array 106.
The conventional system of FIG. 1 operates as follows. If the host computer 102 issues a read command to the disk array controller 104 and the requested information is in the cache 126, the controller 124 forwards the requested information to the host computer 102 and no disk access is necessary. As noted above, this is a cache "hit". If, on the other hand, the requested information is not in the cache 126, the controller 124 retrieves it from the disk array 106 and forwards it to both the cache 126 and the host computer 102. As noted above, this is a cache "miss".
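The read path of FIG. 1 may be modeled, purely as a sketch, as shown below. The classes DiskArray and DiskArrayController and the method handle_read are hypothetical stand-ins for the disk array 106 and the controller 124 with its cache 126. The cache is left unbounded for brevity, and the counter of disk reads is added only to show that a hit avoids a disk access.

```python
class DiskArray:
    """Stand-in for disk array 106; counts how often it is accessed."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self.blocks[address]


class DiskArrayController:
    """Stand-in for controller 124 together with cache 126 (unbounded here)."""

    def __init__(self, disk_array):
        self.disk_array = disk_array
        self.cache = {}

    def handle_read(self, address):
        """Return (data, outcome) for a read command from the host."""
        if address in self.cache:
            # Hit: forward cached data to the host, no disk access.
            return self.cache[address], "hit"
        # Miss: retrieve from the disk array, then forward the data
        # to both the cache and the host.
        data = self.disk_array.read(address)
        self.cache[address] = data
        return data, "miss"


array = DiskArray({7: b"payload"})
controller = DiskArrayController(array)
first = controller.handle_read(7)   # first access goes to the disk array
second = controller.handle_read(7)  # repeat access is served from the cache
```

After both calls, array.reads is 1: only the first read command reached the disk array.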
Certain systems include redundant devices to increase reliability. Such systems may include redundant processors and/or redundant data storage. For example, a system discussed in U.S. Pat. No. 5,548,711 (hereinafter referred to as "the Brant et al patent") includes redundant cache controllers. (See e.g., FIG. 4 of the Brant et al patent.) A significant challenge in systems with redundant caches is maintaining "cache coherency" without adversely affecting system performance. Cache coherency permits continued operation without data loss should one of the caches fail.
One prior art solution to the problem of maintaining cache coherency was to maintain identical caches in each of the subsystems. There are two different ways to maintain identical caches. In the first cache coherency maintenance solution, the entire cache may be periodically transmitted from the main cache to each of the remaining redundant cache(s). In the second cache coherency maintenance solution, each time a cache operation takes place on one cache, the redundant cache(s) is (are) notified of the operation and sent any corresponding data. Then, each of the redundant cache(s) is (are) updated. These two implementations share an obvious problem: the overhead associated with the data transmissions counteracts the benefits of having a cache. The first solution has an additional problem: the cache is not coherent at all times. That is, the data is vulnerable to loss during the period between transmissions.
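The trade-off between the two prior art solutions may be illustrated with the following sketch. It is a simplified model, not an implementation of any cited system. The class MirroredCaches, its method names, and the use of a byte count as a proxy for transmission overhead are all assumptions made for illustration, and only a single redundant cache is modeled.

```python
class MirroredCaches:
    """Main cache plus one redundant cache, with a running transfer total."""

    def __init__(self):
        self.main = {}
        self.redundant = {}
        self.bytes_sent = 0  # assumed proxy for transmission overhead

    def write_local(self, address, data):
        """First solution, step 1: update only the main cache."""
        self.main[address] = data

    def periodic_sync(self):
        """First solution, step 2: transmit the entire main cache."""
        self.redundant = dict(self.main)
        self.bytes_sent += sum(len(v) for v in self.main.values())

    def write_with_update(self, address, data):
        """Second solution: forward every cache operation and its data."""
        self.main[address] = data
        self.redundant[address] = data
        self.bytes_sent += len(data)

    def at_risk(self):
        """Addresses whose data would be lost if the main cache failed now."""
        return {a for a, v in self.main.items() if self.redundant.get(a) != v}


# First solution: periodic transmission of the whole cache.
periodic = MirroredCaches()
periodic.write_local(1, b"aaaa")
periodic.periodic_sync()          # sends 4 bytes
periodic.write_local(2, b"bbbb")
exposed = periodic.at_risk()      # address 2 is unprotected until the next sync
periodic.periodic_sync()          # resends everything: 8 more bytes

# Second solution: update on every operation.
per_op = MirroredCaches()
per_op.write_with_update(1, b"aaaa")
per_op.write_with_update(2, b"bbbb")
```

In this model the periodic scheme sends 12 bytes and leaves address 2 exposed between transmissions, while the per-operation scheme sends 8 bytes and is never exposed but incurs a transmission on every cache operation.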