1. Field of the Invention
The present invention relates to computer systems and, more specifically, to a cache system in a central processing unit of a computer.
2. Description of the Prior Art
Many modern computing systems use a processor having a pipelined architecture to increase instruction throughput. In theory, pipelined processors can execute one instruction per machine cycle when a well-ordered, sequential instruction stream is being executed. This is accomplished even though the instruction itself may implicate or require a number of separate microinstructions to be executed. Pipelined processors operate by breaking up the execution of an instruction into several stages that each require one machine cycle to complete. Latency is reduced in pipelined processors by initiating the processing of a second instruction before the actual execution of the first instruction is completed. In fact, multiple instructions can be in various stages of processing at any given time. Thus, the overall instruction execution latency of the system (which, in general, can be thought of as the delay between the time a sequence of instructions is initiated, and the time it is finished executing) can be significantly reduced.
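By way of illustration only, the throughput benefit described above can be put in numbers. The following minimal sketch assumes an idealized pipeline in which every stage takes exactly one machine cycle and no stalls occur; the function names and the stage and instruction counts are hypothetical and do not come from the disclosure.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction completes every stage
    before the next instruction is started."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed by an ideal, stall-free pipeline: the first
    instruction takes num_stages cycles to complete, and each later
    instruction completes one cycle after its predecessor."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


if __name__ == "__main__":
    # Hypothetical workload: 100 sequential instructions, 5 stages.
    print(unpipelined_cycles(100, 5))
    print(pipelined_cycles(100, 5))
```

Under these assumptions the pipelined machine approaches one instruction per cycle as the instruction stream grows, which is the theoretical limit referred to above.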
In some modern computer systems, integer and commercial instruction streams contain many loads whose targets are used immediately by the next instruction. With higher-frequency microprocessors, pipeline depth has increased such that a level one data cache (L1 Dcache) load access can take many cycles, during which time any following dependent instructions must stall. An additional small data cache, called an L0 or level zero cache, has been proposed to mitigate the longer L1 Dcache access, where the L0 is typically a small cache (1-8 KB) with a total load access time of one cycle. However, in high-frequency pipelined designs, L0 caches have been fraught with problems, including: high miss rates (30-50%) resulting from their small size and direct-mapped (one-way associative) organization, the significant additional complexity of another full data cache level, high power usage due to their constant utilization, and long line fill times that create line reference trailing edge stalls. These factors, together with extremely high-frequency deep pipelines, have led to the general abandonment of L0 caches.
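The effect of the miss rates quoted above on load latency can be estimated with a simple weighted average. The sketch below is illustrative only: it assumes that the L0 and the L1 Dcache are probed in parallel, so that an L0 miss costs the L1 access time, and it uses a hypothetical four-cycle L1 Dcache access. Only the one-cycle L0 access and the 30-50% miss rates are taken from the description above.

```python
def average_load_cycles(l0_hit_cycles: float,
                        l1_access_cycles: float,
                        l0_miss_rate: float) -> float:
    """Expected load latency in cycles, assuming an L0 hit costs
    l0_hit_cycles and an L0 miss costs l1_access_cycles (parallel
    probe assumption; line fill stalls are not modeled)."""
    if not 0.0 <= l0_miss_rate <= 1.0:
        raise ValueError("miss rate must be between 0 and 1")
    hit_rate = 1.0 - l0_miss_rate
    return hit_rate * l0_hit_cycles + l0_miss_rate * l1_access_cycles


if __name__ == "__main__":
    L1_CYCLES = 4.0  # hypothetical L1 Dcache load access time
    for miss_rate in (0.30, 0.50):
        print(miss_rate, average_load_cycles(1.0, L1_CYCLES, miss_rate))
```

Even under these favorable assumptions, a miss rate in the quoted range leaves the average load latency well above the one-cycle ideal, before the added complexity, power usage and line fill stalls are taken into account.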
Therefore, there is a need for a small cache with a short load access time that has a low miss rate, low power usage and a short line fill time.
The disadvantages of the prior art are overcome by the present invention which, in one aspect, is a memory system for a computational circuit having a pipeline including at least one functional unit. An address generator generates a memory address. A coherent cache memory is responsive to the address generator and is addressed by the memory address. A cache directory is associated with the cache memory. The cache memory is capable of generating a cache memory output. A non-coherent directory-less associative memory is responsive to the address generator and is addressable by the memory address. The associative memory receives input data from the cache memory. The associative memory is capable of generating an associative memory output that is delivered to the functional unit. A comparison circuit compares the associative memory output to the cache memory output and asserts a miscompare signal when the associative memory output is not equal to the cache memory output.
In another aspect, the invention is a method of providing data to a functional unit of a pipeline. A coherent cache memory is addressed with a memory address, thereby generating a cache memory output. A non-coherent directory-less associative memory is addressed with the memory address, thereby generating an associative memory output. The associative memory output is delivered to the functional unit. The cache memory output is compared to the associative memory output. When the cache memory output is not identical to the associative memory output, the functional unit is disabled.
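The method summarized above can be sketched in software for illustration. This is a minimal behavioral model under stated assumptions, not the claimed circuit: the coherent cache memory is represented by a plain dictionary that always holds the requested address, the associative memory is assumed to be indexed by the low-order bits of the address, and it is assumed to be refilled from the cache memory output whenever a miscompare occurs. The names `DirectorylessMemory` and `load` and the entry count are hypothetical.

```python
class DirectorylessMemory:
    """Data-only store with no tags and no directory, so it cannot
    tell on its own whether the value it returns is the right one."""

    def __init__(self, num_entries: int = 8):  # hypothetical size
        self.num_entries = num_entries
        self.data = [0] * num_entries

    def _index(self, address: int) -> int:
        # Assumed indexing scheme: low-order bits of the address.
        return address % self.num_entries

    def read(self, address: int) -> int:
        return self.data[self._index(address)]

    def fill(self, address: int, value: int) -> None:
        self.data[self._index(address)] = value


def load(address: int, cache: dict, l0: DirectorylessMemory):
    """Return (value delivered to the functional unit, miscompare).

    The associative memory output is delivered first; the coherent
    cache output is then compared against it. A True miscompare
    flag means the functional unit must be disabled for this load."""
    associative_output = l0.read(address)
    cache_output = cache[address]  # stand-in for the coherent cache
    miscompare = associative_output != cache_output
    if miscompare:
        # Assumed refill policy: take input data from the cache.
        l0.fill(address, cache_output)
    return associative_output, miscompare


if __name__ == "__main__":
    cache = {0x10: 7, 0x18: 9}  # hypothetical cache contents
    l0 = DirectorylessMemory(num_entries=8)
    for addr in (0x10, 0x10, 0x18, 0x18):
        print(hex(addr), load(addr, cache, l0))
```

In this model the two addresses map to the same associative memory entry, so the first access to each produces a miscompare and the repeated access does not. Correctness rests entirely on the comparison with the cache memory output, which is why the small memory needs neither a directory nor coherency support of its own.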
These and other aspects of the invention will become apparent from the following description of the preferred embodiments taken in conjunction with the following drawings. As would be obvious to one skilled in the art, many variations and modifications of the invention may be effected without departing from the spirit and scope of the novel concepts of the disclosure.