Historically, the demands of microprocessor technology have been increasing at a faster rate than the support technologies, such as dynamic random access memory (DRAM) and programmable transistor-transistor-logic (TTL). Recent trends are further aggravating this mismatch in the following ways. First, microprocessor clock rates are rapidly approaching, and in some cases exceeding, the clock rates of standard support logic. In addition, the clocks-per-instruction rate is rapidly decreasing, putting a very high bandwidth demand on memory. Newer designs, such as reduced-instruction-set-computer (RISC) architectures, are demanding ever more memory bandwidth to accomplish the same amount of work. The memory bandwidth demand has been further aggravated by the need for direct memory access (DMA) by devices such as co-processors and multi-processors. Finally, the rate at which new devices are being introduced into the marketplace is accelerating, further exacerbating all of the above.
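The relationship between clocks per instruction and memory bandwidth demand can be illustrated with a rough first-order model. The sketch below is an assumption for illustration only: the function name, the 25 MHz clock, the 4-byte instruction size and the two clocks-per-instruction values are hypothetical and are not taken from any particular processor. It counts instruction fetches only and ignores data accesses, DMA traffic and caching.

```python
def instruction_fetch_bandwidth(clock_hz: float, clocks_per_instruction: float,
                                bytes_per_instruction: float) -> float:
    """Rough instruction-fetch bandwidth demand in bytes per second.

    First-order model (an assumption): every instruction is fetched from
    memory exactly once, with no cache and no data traffic.
    """
    if clock_hz <= 0 or clocks_per_instruction <= 0 or bytes_per_instruction <= 0:
        raise ValueError("all arguments must be positive")
    instructions_per_second = clock_hz / clocks_per_instruction
    return instructions_per_second * bytes_per_instruction


# Hypothetical figures, chosen only to show the trend.
CLOCK_HZ = 25_000_000          # 25 MHz clock (assumed)
BYTES_PER_INSTRUCTION = 4      # fixed-width instruction (assumed)

older = instruction_fetch_bandwidth(CLOCK_HZ, 5, BYTES_PER_INSTRUCTION)
newer = instruction_fetch_bandwidth(CLOCK_HZ, 1, BYTES_PER_INSTRUCTION)

print(f"5 clocks/instruction: {older / 1e6:.0f} MB/s")
print(f"1 clock/instruction:  {newer / 1e6:.0f} MB/s")
```

Under these assumed figures, the clock rate is unchanged, yet the demand on memory grows in inverse proportion to the clocks-per-instruction rate.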
As a result of these trends, two severe performance bottlenecks have emerged that continue to influence the way that systems are designed. The first, memory bandwidth, has already forced the use of cache memories in many microprocessor systems. By way of example, the use of cache memories is commonplace in the 80386™ generation of microprocessors manufactured by Intel Corporation. Also, Intel's 80486™, i860™ and i860XP™ microprocessors include on-chip caches for enhanced performance. It is clear that further changes in the memory hierarchy (primary cache, secondary cache, DRAM architectures, etc.) will be required to sustain performance increases in future generations. (Note that "Intel", "80386", "80486", "i860" and "i860XP" are all trademarks of Intel Corporation.)
The second performance bottleneck is the clock rate and the associated input/output (I/O) timings. It has become apparent that the investment required to continue increasing the microprocessor clock rate (and the resulting I/O timings) cannot be sustained across all components in the system. Even if one could afford the investment, the schedule impact of treadmilling and coordinating every component, potentially across multiple vendors, could easily make such an architecture non-competitive. These factors have already forced the use of asynchronous interfaces to isolate the frequency scaling problem to a subset of the system components. In the future, it is clear that high speed CPU interfaces will need to be designed around an even more tightly controlled specification in order to reach the desired level of performance.
Typical of the drawbacks characteristic of past approaches is the inability to support concurrent operations at both the CPU and memory interfaces. That is, for every access to the read/write storage array, only one piece of data gets transferred. This means that the cache static-random-access-memory (SRAM) array needs to be accessed repeatedly to obtain each piece of the cache line, blocking access from the other interface. Alternatively, a wide bank of SRAMs could be employed along with corresponding external multiplexers, but only at the considerable expense of additional complexity and cost.
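The cost of transferring one piece of data per array access can be made concrete by counting accesses. The following is a minimal sketch under stated assumptions: the function name, the 32-byte cache line and the 8-byte bus width are hypothetical and chosen only for illustration. It compares an array that delivers one bus-width datum per access with the wide-bank alternative mentioned above, which delivers the whole line in a single access.

```python
def array_accesses(line_bytes: int, access_width_bytes: int) -> int:
    """Number of SRAM array accesses needed to move one cache line."""
    if line_bytes <= 0 or access_width_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partial final chunk still costs a full access.
    return -(-line_bytes // access_width_bytes)


# Hypothetical sizes, for illustration only.
LINE_BYTES = 32   # cache line size (assumed)
BUS_BYTES = 8     # data delivered per narrow array access (assumed)

narrow = array_accesses(LINE_BYTES, BUS_BYTES)    # one datum per access
wide = array_accesses(LINE_BYTES, LINE_BYTES)     # wide bank, full line per access

print(f"narrow array: {narrow} accesses per line")
print(f"wide bank:    {wide} access per line")
```

In this simple model, each access beyond the first is a period during which the array is unavailable to the other interface. The wide bank removes those extra accesses, at the cost of the additional SRAMs and external multiplexers.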
Another common drawback of prior art cache memories is that every transfer is required to be synchronized. In other words, before data arriving from the memory bus can be transferred to the CPU bus, a handshake must occur synchronous with the microprocessor clock. This process must be repeated for each data transfer from the memory bus. Note that this is simply another way of stating that the transfer of data between the memory and CPU buses requires synchronous operation. Such operation presents a serious burden on the computer system's performance, especially with increased CPU clock rates.
As will be seen, the present invention discloses an integrated cache memory employed within a CPU/cache core architecture that is intended to overcome the performance bottlenecks described above. When utilized in conjunction with microprocessors such as the 80486, the numerous features of the invented cache solution are capable of linearly scaling the performance of these CPUs to previously unrealized speeds (e.g., >50 MHz).