The present invention relates to computer systems including cache memories improved to increase reliability.
Computer systems include a processor and a memory for holding instructions and data for processing by the processor. In order to decrease the latency of memory accesses, computer systems often use a technique known as cache memory. In a computer system utilizing cache memory, a main memory, which holds all the instructions and data for the processor, is coupled to the processor over a system bus, and a smaller, faster memory, the cache memory, is coupled to the processor over a fast local bus. The cache memory holds a subset of the data stored in the main memory.
If the processor requests data at an address which is in the cache memory, called a cache hit, then the request may be granted in a much shorter time because the cache memory itself operates faster than the main memory, and because it is coupled to the processor over the local bus which operates faster than the system bus. Only if the address of the requested data is not in the cache memory, called a cache miss, is the memory request forwarded to the main memory, which operates slower than the cache memory, and is coupled to the processor over the slower system bus. The actual increase in speed resulting from use of a cache memory depends upon the ratio of the number of memory accesses which are filled from the cache memory to the total number of memory accesses, called the hit-ratio. In order to maximize the hit-ratio, when one piece of data is transferred from the main memory to the cache memory, some further amount of data from addresses in the neighborhood of that of the requested piece of data, called a block, is transferred to the cache memory at the same time, a process known as cache fill.
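By way of illustration only, the dependence of the speed increase on the hit-ratio may be sketched as follows. The sketch assumes a simple model in which every cache hit costs one fixed access time and every cache miss costs one fixed (longer) access time; the function name and the timing values in the usage note are hypothetical and are not taken from this description.

```python
def average_access_time(hits, total, cache_time, memory_time):
    """Average time per memory access under a simple two-level model.

    hits / total is the hit-ratio. cache_time and memory_time are
    assumed, fixed costs of a cache hit and a cache miss respectively.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    hit_ratio = hits / total
    # Hits are served at cache speed, misses at main memory speed.
    return hit_ratio * cache_time + (1.0 - hit_ratio) * memory_time
```

For example, with three hits out of every four accesses, a hit cost of 10 time units and a miss cost of 100 time units, average_access_time(3, 4, 10, 100) gives 32.5 time units, compared with 100 time units when no cache memory is present.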
The increase in access speed resulting from use of a cache memory is even greater if the computer system is a multiprocessor computer system. A multiprocessor computer system consists of several processor modules, each including a processor, which share a single main memory. All of the processor modules must share the system bus, and if it is busy, the processors which have a bus request must wait until it is free. This imposes yet another delay in satisfying a memory request which is a cache miss. In such a computer system, each processor module may include its own cache memory.
In some computer systems, the cache memory in each processor module is configured as a write-back cache. In a write-back cache, when a request to write data is processed, the main memory block into which the data is to be written is transferred from main memory to the cache memory of the processor module, and that block is marked in the main memory as being "owned" by that processor module. No other module is allowed to write to that block. Subsequent writes to that block take place within the cache memory only. This decreases the number of system bus accesses, thus decreasing memory access time. However, the cache memory of that particular processor module then contains the only accurate copy of that block. Because the cache memory may contain the only accurate copy of memory data, it is important that the data in the cache memory, and access to it, be protected as much as possible.
A cache memory includes random access memories (RAMs) for containing the data in the cache memory, and a controller for controlling the cache memory. The cache RAMs are divided into a number of blocks, each of which may contain a block of data from the main memory. The cache controller keeps track of which main memory blocks are currently in the cache RAMs by maintaining a storage device which includes one location for each block in the cache RAMs. Each location in the storage device contains a first portion, called a tag, which identifies which main memory block is in the corresponding block; and a second portion which contains the status of that block in the cache RAMs. For example, each block in the cache RAMs may be valid or invalid, or may be writable (called dirty) or read-only. Because this storage device contains tags identifying which main memory blocks are in which blocks in the cache RAMs, this device is called a tag store.
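By way of illustration only, a tag store of the kind described above may be modeled as follows. The sketch assumes a direct-mapped organization, in which each main memory block can occupy exactly one block in the cache RAMs; that organization, the block size, the number of blocks and all names are hypothetical choices made for the example.

```python
from dataclasses import dataclass

BLOCK_SIZE = 64    # bytes per block (hypothetical value)
NUM_BLOCKS = 256   # blocks in the cache RAMs (hypothetical value)


@dataclass
class TagEntry:
    """One tag store location: a tag portion and a status portion."""
    tag: int = 0         # identifies which main memory block is cached
    valid: bool = False  # status: block holds usable data
    dirty: bool = False  # status: block is writable (owned)


def split_address(address):
    """Split a main memory address into (tag, index).

    Assumes direct mapping: the index selects the tag store location,
    and the tag distinguishes the memory blocks that share that index.
    """
    block_number = address // BLOCK_SIZE
    return block_number // NUM_BLOCKS, block_number % NUM_BLOCKS


# One tag store location for each block in the cache RAMs.
tag_store = [TagEntry() for _ in range(NUM_BLOCKS)]
```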
The processor requests memory accesses by sending the main memory address of the desired data to the cache controller. The cache controller checks the tag store to determine whether the desired main memory address is stored in the cache RAMs and whether the block in the cache RAMs is valid. If the request is for a memory write, the cache controller also checks the tag store to determine if the block in the cache RAMs is dirty (writable). If the addressed data is in the cache RAMs and valid (and dirty for a write access) then the cache controller issues the proper signals to the cache RAMs to make the requested data transfer. If the desired data is not in the cache RAMs, or if the block is not valid (or, for a write access, not both valid and dirty), then the cache controller requests the desired data from the main memory, sends the desired data to the processor when it is available, fills the remainder of the block in the cache RAMs, and updates the tag store.
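By way of illustration only, the hit or miss decision made by the cache controller may be sketched as follows. The function name and its parameters are hypothetical; the function models only the decision described above and not the subsequent data transfer or cache fill.

```python
def classify_access(entry_tag, valid, dirty, request_tag, is_write):
    """Return "hit" or "miss" for one processor request.

    entry_tag, valid and dirty are the contents of the tag store
    location selected by the request; request_tag is the tag portion
    of the requested main memory address.
    """
    if entry_tag != request_tag or not valid:
        return "miss"   # block absent from the cache RAMs, or invalid
    if is_write and not dirty:
        return "miss"   # a write needs a valid AND dirty (writable) block
    return "hit"
```

A read of a valid, read-only block is therefore a hit, while a write to the same block is treated as a miss and is forwarded to the main memory.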
In a multiprocessor computer system, it is necessary for all the cache memories to contain accurate information. This entails keeping track of the main memory accesses on the system bus. For example, if a processor module owns a block (i.e., has write privileges) and another processor module requests a read from or write to that block, then the first processor module must write back that block into main memory so the second processor module may have access to it, and mark that block in the cache RAMs as being not valid and not dirty. Alternatively, if the first processor module has a read-only copy of a block, and a second processor module requests a write to that block, then that block in the cache RAMs must be marked invalid. The processor module includes circuitry to monitor the memory requests on the system bus and to check each one in the tag store in the cache controller to determine whether a write-back or invalidate must be performed on the block.
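By way of illustration only, the decision made when a memory request from another processor module is observed on the system bus may be sketched as follows. The function name and the action labels are hypothetical. The case of a remote read of a read-only block is not addressed above; the sketch assumes that no action is needed in that case.

```python
def snoop_action(valid, dirty, remote_is_write):
    """Decide how to respond to a remote request that matches a block.

    Returns (action, new_valid, new_dirty), where action is one of
    "write_back", "invalidate" or "none".
    """
    if not valid:
        return ("none", valid, dirty)        # nothing cached to protect
    if dirty:
        # Owned block: write it back to main memory for any remote
        # read or write, then mark it not valid and not dirty.
        return ("write_back", False, False)
    if remote_is_write:
        # Read-only copy and a remote write: mark the copy invalid.
        return ("invalidate", False, False)
    # Read-only copy and a remote read: assumed to need no action.
    return ("none", True, False)
```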
Some write-back cache controllers subdivide the blocks into subblocks, each of which has different write privileges. For example, each block may be divided into four subblocks. In order to maintain the status of these subblocks, each location in the tag store would maintain four sets of status indicators, one for each of the subblocks in that block. In such a cache memory, only the dirty subblock must be written back upon a request for an address in that block by a different processor module.
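By way of illustration only, the selection of subblocks to be written back may be sketched as follows. The sketch assumes that the status of a block is held as a list of (valid, dirty) pairs, one pair per subblock; that representation and the names used are hypothetical.

```python
SUBBLOCKS_PER_BLOCK = 4  # matches the four-subblock example above


def subblocks_to_write_back(status):
    """Return the indices of the subblocks that must be written back.

    status is a list of (valid, dirty) pairs, one per subblock.
    Only subblocks that are both valid and dirty hold data that
    main memory does not have.
    """
    if len(status) != SUBBLOCKS_PER_BLOCK:
        raise ValueError("expected one status pair per subblock")
    return [i for i, (valid, dirty) in enumerate(status) if valid and dirty]
```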
In order to enhance the reliability of access to the data, cache memories include some way of protecting access to the tag store. One method used is to include some error detection coding in the tag store. For example, parity bits may be included in each location in the tag store. If there are multiple status bits, a parity bit may also be appended to the status bits. Whenever a new tag and status bits are written, the parity bits are generated. Whenever the tag and status bits are accessed, the parity of the accessed data is checked. As long as the parity is correct, there is no change in the operation of the cache memory. If it is incorrect, then it is not possible to accurately determine which main memory block is in the corresponding block in the cache RAMs. In this situation, an error is reported and the cache memory alters its operation. The processor may initiate an error recovery program to diagnose and correct the tag store problem in response to the error signal. In addition, the cache memory may partially turn off. For example, all memory requests may be treated as cache misses (requiring direct access to the main memory) except for those accesses to dirty blocks. Because dirty blocks contain the only accurate copy of that data, the cache memory must continue to satisfy requests to dirty blocks. U.S. patent application Ser. No. 07/547,597, filed Jun. 29, 1990, entitled ERROR TRANSITION MODE FOR MULTIPROCESSOR SYSTEM, by Stamm et al., describes a method and apparatus for implementing a write-back cache memory system in a multiprocessor computer system.
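By way of illustration only, the generation and checking of parity bits for a tag store location may be sketched as follows. The sketch assumes even parity, a single parity bit for the tag and a single parity bit for the status bits; those choices, the dictionary representation and all names are hypothetical.

```python
def parity(value):
    """Return 1 if value has an odd number of 1 bits, otherwise 0."""
    return bin(value).count("1") & 1


def write_entry(tag, status):
    """Build a tag store location, generating both parity bits."""
    return {
        "tag": tag,
        "tag_parity": parity(tag),
        "status": status,
        "status_parity": parity(status),
    }


def check_entry(entry):
    """Return True if both stored parity bits match the stored data."""
    return (parity(entry["tag"]) == entry["tag_parity"]
            and parity(entry["status"]) == entry["status_parity"])
```

A single bit flipped in either the tag or the status bits makes check_entry return False, which corresponds to the situation in which an error is reported.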
In the case of the tag data, when a new tag is written into the tag store, the parity must be generated over the entire new tag because there is no relationship between the previous contents of the tag portion of that location and the new contents. However, in the case of a multibit status portion of that location, only a subset of the status bits changes at any time.
As described above, for any subblock, there is one valid bit and one dirty bit. If the valid bit is a logic `0` signal, then the subblock is invalid. If the valid bit is a logic `1` signal, and the dirty bit is a logic `0` signal, then the subblock is valid and read-only. If the valid bit is a logic `1` signal and the dirty bit is a logic `1` signal, then the subblock is valid and writable. Only one of four transactions can be performed on a subblock of cache memory: `make valid`, `make valid and dirty`, `make not valid` and `make not valid and not dirty`. Thus, for any particular transaction on the tag store, at most two status bits change at any one time.
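By way of illustration only, the valid and dirty bits of one subblock and the four transactions may be sketched as follows. The sketch assumes that each transaction affects only the bits it names (for example, that `make valid` sets the valid bit and leaves the dirty bit unchanged); that interpretation, the bit positions and all names are hypothetical.

```python
VALID = 0b10  # hypothetical bit position of the valid bit
DIRTY = 0b01  # hypothetical bit position of the dirty bit

# Each transaction is modeled as (bits to set, bits to clear).
TRANSACTIONS = {
    "make valid": (VALID, 0),
    "make valid and dirty": (VALID | DIRTY, 0),
    "make not valid": (0, VALID),
    "make not valid and not dirty": (0, VALID | DIRTY),
}


def apply_transaction(status, name):
    """Return the two status bits of a subblock after one transaction."""
    set_bits, clear_bits = TRANSACTIONS[name]
    return (status | set_bits) & ~clear_bits & 0b11


def describe(status):
    """Translate the two status bits into the states named above."""
    if not status & VALID:
        return "invalid"
    return "valid and writable" if status & DIRTY else "valid and read-only"
```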
In order to decrease the latency time of the cache memory, it is desirable to decrease the time needed to update the parity bit when the status of a subblock is changed in the cache memory.
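By way of illustration only, a general property of parity makes such a reduction possible in principle: the new parity bit equals the old parity bit exclusive-ORed with the parity of only those bits that changed. The following sketch shows that property alone; it is a generic technique, is not asserted to be the approach of any particular cache memory, and uses hypothetical names.

```python
def parity(value):
    """Return 1 if value has an odd number of 1 bits, otherwise 0."""
    return bin(value).count("1") & 1


def update_parity(old_parity, old_bits, new_bits):
    """Update a parity bit using only the bits that changed.

    old_bits and new_bits are the values of the changing status bits
    (for example, the valid and dirty bits of one subblock) before
    and after the transaction. Unchanged status bits are not read.
    """
    return old_parity ^ parity(old_bits ^ new_bits)
```

For example, if a subblock changes from invalid (both bits 0) to valid and dirty (both bits 1), two bits flip and the parity bit is unchanged; if only the valid bit is set, one bit flips and the parity bit is inverted.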