The present invention relates generally to microprocessors and, more particularly, to a shared instruction cache for multiple processors.
A microprocessor typically includes a cache memory for storing copies of recently accessed information. The cache memory is generally smaller but faster than main memory (e.g., DRAM). In particular, a microprocessor typically includes an instruction cache for storing recently accessed (i.e., recently used) instructions. The instruction cache is generally located on the same integrated circuit chip (or die) as the core logic of the microprocessor.
FIG. 1 is a block diagram of a prior art instruction cache subsystem of a multi-processor system 100. In particular, multi-processor system 100 includes two processors, a P1 processor 102 and a P2 processor 104. P1 processor 102 and P2 processor 104 each access a main memory 106 via a bus 108. P1 processor 102 caches recently used instructions in an instruction cache 110. P2 processor 104 caches recently used instructions in an instruction cache 112. P1 processor 102 and instruction cache 110 reside on die (chip) 114. P2 processor 104 and instruction cache 112 reside on die 116. Accordingly, prior art system 100 represents an SMP (Symmetric Multi-Processing) system in which the processors share main memory 106. Further, instruction cache 110 and instruction cache 112 typically each include two ports: a port for connecting to P1 processor 102 or P2 processor 104, respectively, and a port for connecting to main memory 106. The ports can be physical ports or logical ports.
The present invention provides a shared instruction cache for multiple processors. For example, the present invention provides a cost-effective and high-performance instruction cache subsystem in a microprocessor that includes multiple processors (i.e., CPUs (Central Processing Units)).
In one embodiment, an apparatus for a microprocessor includes an instruction cache that is shared by a first processor and a second processor, a first register index base for the first processor, and a first memory address base for the first processor. The apparatus also includes a second register index base for the second processor and a second memory address base for the second processor. On each processor, a register access is offset using the register index base (e.g., a register address specifier is concatenated with the register index base). Similarly, on each processor, a memory access is offset using the memory address base (e.g., a memory address specifier is concatenated with the memory address base). This embodiment thus provides a shared instruction cache for multiple processors together with a hardware-implemented segmentation of register files and main memory based on which processor is executing a particular instruction (e.g., an instruction that involves a register access or a memory access). For example, this embodiment allows a thread of a multi-threaded computer program that is executed by the first processor, and the same thread that is executed by the second processor, to generate register files that can later be combined, because the register index bases can be set such that the register address specifiers of the two executions do not map to overlapping registers. Similarly, when the same thread is executed on the first processor and on the second processor with different values set in the respective memory address bases, the data written into main memory can be ensured not to overlap, such that the results of the two executions can subsequently be compared or combined.
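For illustration only, the concatenation of a specifier with a per-processor base described above can be modeled in software as in the following minimal sketch. The field widths (`REG_SPEC_BITS`, `MEM_SPEC_BITS`), the helper `concatenate`, the class `ProcessorContext`, and the base values are hypothetical and are not taken from this description; the sketch also assumes that the base occupies the high-order bits above the specifier, which is one possible reading of "concatenated."

```python
from dataclasses import dataclass

# Assumed field widths, chosen only for illustration.
REG_SPEC_BITS = 5    # width of a register address specifier
MEM_SPEC_BITS = 16   # width of a memory address specifier


def concatenate(base: int, specifier: int, specifier_bits: int) -> int:
    """Place `base` in the high-order bits above `specifier`."""
    if not 0 <= specifier < (1 << specifier_bits):
        raise ValueError("specifier does not fit in its field")
    if base < 0:
        raise ValueError("base must be non-negative")
    return (base << specifier_bits) | specifier


@dataclass(frozen=True)
class ProcessorContext:
    """Hypothetical per-processor state: one register index base and
    one memory address base."""
    register_index_base: int
    memory_address_base: int

    def register_index(self, reg_specifier: int) -> int:
        return concatenate(self.register_index_base, reg_specifier, REG_SPEC_BITS)

    def memory_address(self, mem_specifier: int) -> int:
        return concatenate(self.memory_address_base, mem_specifier, MEM_SPEC_BITS)


# Two processors with different (assumed) base values.
first = ProcessorContext(register_index_base=0, memory_address_base=0)
second = ProcessorContext(register_index_base=1, memory_address_base=1)

# The same specifiers, as would be fetched from the shared instruction cache.
reg_specifier, mem_specifier = 3, 0x0010

print(first.register_index(reg_specifier), second.register_index(reg_specifier))
print(hex(first.memory_address(mem_specifier)), hex(second.memory_address(mem_specifier)))
```

Under these assumed widths and base values, the same register address specifier 3 resolves to register 3 on the first processor and register 35 on the second, and the same memory address specifier 0x0010 resolves to 0x10 and 0x10010, respectively, so the two executions of the same instruction do not overlap in either the register file or main memory.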
Other aspects and advantages of the present invention will become apparent from the following detailed description and accompanying drawings.