1. Field of the Invention
The present invention relates to the field of computer systems and associated cache memory structures. More particularly, the present invention relates to a cache controller and associated registers to maintain cache consistency during multiple overlapping cache access operations.
2. Art Background
Typically, a central processing unit (CPU) in a computer system operates at a substantially faster speed than main memory. When the CPU executes instructions faster than memory can supply them, the CPU must idle until the next instruction, or the datum upon which the instruction will operate, is available. CPU idle time adversely affects system performance. To avoid unnecessary CPU idle time while awaiting data or instructions from the large main memory, a smaller cache memory capable of operating at a higher speed than the main memory is often used to buffer the data and the instructions between the main memory and the CPU. The data and instructions in memory locations of the main memory are mapped into the cache memory in block frames. Each block frame consists of a block offset corresponding to a number of memory locations storing data and instructions associated with that block. Access to a block frame of the cache is typically made via a cache directory storing physical address tags and status bits corresponding to the respective block frames.
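By way of illustration only, the directory lookup described above can be sketched in software. The sketch assumes a direct-mapped organization with 32-byte block frames and 1024 frames, and a single "valid" status bit per entry. None of these parameters is taken from the present description, and the names `DirectoryEntry`, `split_address` and `is_hit` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed geometry, chosen only for illustration.
BLOCK_SIZE = 32     # memory locations (bytes) per block frame
NUM_FRAMES = 1024   # block frames in the cache

OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # 5 bits select a location in a block
INDEX_BITS = NUM_FRAMES.bit_length() - 1    # 10 bits select a block frame


@dataclass
class DirectoryEntry:
    """One cache directory entry: a physical address tag plus a status bit."""
    tag: int = 0
    valid: bool = False


def split_address(addr: int) -> Tuple[int, int, int]:
    """Split a physical address into (tag, frame index, block offset)."""
    offset = addr & (BLOCK_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_FRAMES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def is_hit(directory: List[DirectoryEntry], addr: int) -> bool:
    """A hit requires a valid entry whose stored tag matches the address tag."""
    tag, index, _ = split_address(addr)
    entry = directory[index]
    return entry.valid and entry.tag == tag
```

In this sketch the directory is consulted before the block frame itself, so a tag mismatch or a cleared valid bit is detected without touching the cached data.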
When a cache read "miss" occurs, that is, when the datum or instruction requested by the CPU is not in the cache memory, the cache memory must retrieve the datum or instruction from the main memory. To do so, typically the entire block frame of data or instructions including the requested datum or instruction is retrieved, and the CPU idles until the entire block frame retrieval is completed. Many other cache performance problems and improvement techniques exist, the reader being referred to, for example, J. L. Hennessy and D. A. Patterson, Computer Architecture--A Quantitative Approach, pp. 454-61, (Morgan Kaufmann, 1990).
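The read-miss behavior described above may be illustrated with a minimal software model. The model assumes a direct-mapped cache with four memory locations per block frame and eight frames, sizes chosen only to keep the example small. The class `SimpleCache` and its counter `locations_fetched` are hypothetical and do not represent the controller of the present invention.

```python
BLOCK_SIZE = 4   # memory locations per block frame (assumed)
NUM_FRAMES = 8   # block frames in the cache (assumed)


class SimpleCache:
    """Toy direct-mapped cache that fetches a whole block frame on a miss."""

    def __init__(self, main_memory):
        self.memory = main_memory
        self.tags = [None] * NUM_FRAMES   # None marks an empty frame
        self.frames = [[0] * BLOCK_SIZE for _ in range(NUM_FRAMES)]
        self.locations_fetched = 0        # traffic from main memory

    def read(self, addr):
        block_number = addr // BLOCK_SIZE
        index = block_number % NUM_FRAMES
        tag = block_number // NUM_FRAMES
        if self.tags[index] != tag:
            # Read miss: retrieve the entire block frame containing addr,
            # not just the requested location.
            base = block_number * BLOCK_SIZE
            self.frames[index] = self.memory[base:base + BLOCK_SIZE]
            self.tags[index] = tag
            self.locations_fetched += BLOCK_SIZE
        return self.frames[index][addr % BLOCK_SIZE]
```

A read of one location thus costs a full block transfer on a miss, while a later read of a neighboring location in the same block is served from the cache without further memory traffic.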
More recently, computer systems having multiple processors have become common as a means of increasing processing speed. In a multiple processor (MP) system, some or all of the processors may simultaneously attempt to access the block frames stored in the cache, for either read or write purposes, and may direct that data be routed to or from any of various input/output (I/O) devices. Because any of several processors may access and alter cache-stored data, proper system operation depends on maintaining proper correspondence of the data stored in the cache with the corresponding processor. This correspondence of data to the proper processor is termed "cache consistency".
Previously, cache consistency in MP systems has typically been guaranteed by providing a duplicate copy of the cache directory. The duplicate directory is normally used so that a processor on a bus interconnecting multiple processors can consult the duplicate during a snoop operation without requiring access to the cache directory itself. However, as cache size and, thus, cache directory size increase, maintaining a duplicate copy of the directory can become costly in terms of both hardware expense and performance of the computer system.
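The prior-art duplicate-directory arrangement can be sketched as follows, purely to make its storage cost concrete. The sketch assumes a direct-mapped cache addressed by block number. The class `DualDirectoryCache` and its methods are hypothetical names, not part of any actual prior-art design.

```python
class DualDirectoryCache:
    """Toy model of a cache that keeps a duplicate directory for bus snoops."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.directory = [None] * num_frames        # consulted by the local CPU
        self.snoop_directory = [None] * num_frames  # duplicate, consulted by snoops

    def _locate(self, block_number):
        return block_number % self.num_frames, block_number // self.num_frames

    def fill(self, block_number):
        # Every directory update must be applied to both copies.
        index, tag = self._locate(block_number)
        self.directory[index] = tag
        self.snoop_directory[index] = tag

    def cpu_lookup(self, block_number):
        index, tag = self._locate(block_number)
        return self.directory[index] == tag

    def snoop(self, block_number):
        # The snoop reads only the duplicate, leaving the main directory free.
        index, tag = self._locate(block_number)
        return self.snoop_directory[index] == tag

    def invalidate(self, block_number):
        index, _ = self._locate(block_number)
        if self.snoop(block_number):
            self.directory[index] = None
            self.snoop_directory[index] = None

    def tag_entries(self):
        # Tag storage is doubled relative to a single directory.
        return len(self.directory) + len(self.snoop_directory)
```

The doubled tag storage reported by `tag_entries` grows in step with the number of block frames, which is the cost noted above for large caches.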
Thus, it is desirable to provide a new approach for controlling a cache memory structure to maintain cache consistency in a computer system having multiple processors issuing multiple outstanding read and write operations in an overlapping, substantially contemporaneous fashion. It is particularly desirable if cache miss penalties are thereby reduced. It is also desirable if the hardware requirements necessary to implement the cache controller and associated control registers can be minimized.
As will be described in the following detailed description, these objects and desired results are among the objects and desired results of the present invention, which overcomes the disadvantages of the prior art and provides methods and a cache memory controller for implementing a cache memory system for fetching data for a multiple-CPU computer system. The present invention reduces the need for SRAM-intensive cache structures, while maintaining data consistency between caches and CPUs.