1. Field
The present disclosure pertains to the field of processing systems and their associated caching arrangements.
2. Description of Related Art
Improving the performance of computer or other processing systems generally improves overall throughput and/or provides a better user experience. One technique for increasing the number of instructions a system processes is to increase the number of processors in the system. Implementing multiprocessing (MP) systems, however, typically requires more than merely interconnecting processors in parallel. For example, tasks or programs may need to be divided so that they can execute across parallel processing resources, and memory consistency systems may be needed.
As logic elements continue to shrink due to advances in fabrication technology, integrating multiple processors into a single component becomes more practical, and in fact a number of current designs implement multiple processors on a single component (a “multicore processor”). Multicore processors also typically integrate cache memory beyond any caches closely associated with each processor core, and various techniques are used to maintain coherency across the hierarchy within the multicore processor device.
For example, in one prior art processor, a level one (L1) cache associated with each processor core is implemented as a write-through cache, such that a shared level two (L2) cache receives all modifications made by each L1. While write-through is known to be inferior in performance under some circumstances compared to a protocol such as the well-known four-state MESI (Modified, Exclusive, Shared, Invalid) protocol, the use of write-through eliminates the need for cross-interrogation of the L1 caches in this prior art multicore processor. Without cross-interrogation between L1 caches, no snoop bus is provided between L1 caches, and no L1-to-L1 transfers may occur. Moreover, since there is no cross communication between the L1 caches, no sharing of caching resources associated with particular processor cores occurs. Only the L2, which is not associated with any particular processor core, is shared between the separate processor cores.
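The write-through arrangement described above may be illustrated with a minimal software sketch. The sketch is a toy model under stated assumptions, not a description of any actual prior art implementation: the class name `WriteThroughSystem`, its methods, and the use of dictionaries to stand in for cache arrays are all hypothetical, and invalidation of stale copies held in other L1 caches is deliberately omitted because it is not described above.

```python
class WriteThroughSystem:
    """Toy model (hypothetical): one private write-through L1 per core, one shared L2."""

    def __init__(self, num_cores):
        # One private L1 per core; each maps address -> value.
        self.l1 = [dict() for _ in range(num_cores)]
        # A single L2 shared by all cores.
        self.l2 = {}

    def write(self, core, addr, value):
        # Write-through: the store updates the writer's own L1 and is
        # always forwarded to the shared L2 as well.
        self.l1[core][addr] = value
        self.l2[addr] = value

    def read(self, core, addr):
        # A read consults only the requesting core's own L1 ...
        if addr in self.l1[core]:
            return self.l1[core][addr]
        # ... and then the shared L2. No other core's L1 is ever probed,
        # so there is no cross-interrogation and no L1-to-L1 transfer.
        value = self.l2.get(addr)
        if value is not None:
            self.l1[core][addr] = value  # fill the private L1 from the L2
        return value


system = WriteThroughSystem(num_cores=2)
system.write(0, 0x40, 7)          # core 0 stores; the L2 receives the modification
result = system.read(1, 0x40)     # core 1 misses in its L1 and is served by the L2
```

In this sketch, core 1 obtains the value written by core 0 solely through the shared L2, which is why no snoop path between the L1 caches is needed in the model.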
In another prior art multicore processor, two L1 caches are also separated by the L2 cache. In this prior art processor, the core logic is linked directly to the L2 cache control logic and to the private L1. Thus, coherency lookups in the L1 and L2 may begin simultaneously; however, the L2 control logic separates the L1 associated with the first core from the L1 associated with the second core. Therefore, once again, the L1 caches private to and associated with each processor core are not linked to each other. Accordingly, there is no direct cross-interrogation between L1 caches, no direct L1-to-L1 data passing, and no sharing of the L1 caches between the separate cores. Only the L2, which is not associated with any particular processor core, is shared between the separate processor cores.
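The lookup path of this second arrangement may likewise be sketched in software. The function name `lookup`, the use of sets to represent cache contents, and the example addresses are hypothetical and chosen only for illustration. The two probes that hardware would begin simultaneously are simply evaluated one after the other here, and how the L2 control logic would resolve a line held only in the other core's L1 is not described above and is not modeled.

```python
def lookup(core, addr, l1_caches, l2):
    """Probe the requesting core's private L1 and the shared L2 for addr.

    Hypothetical sketch: l1_caches is a list of sets (one per core) and
    l2 is a set, each holding the addresses currently cached.
    """
    # The core is linked to its own L1 and to the L2 control logic, so
    # both probes can be issued together for the same address.
    l1_hit = addr in l1_caches[core]
    l2_hit = addr in l2
    # Note that l1_caches[other_core] is never consulted: the L1 caches
    # are not linked to each other.
    return {"l1_hit": l1_hit, "l2_hit": l2_hit}


# Illustrative contents only (assumed, not taken from any real device).
l1_caches = [{0x80}, {0xC0}]
l2 = {0x80}

own_line = lookup(0, 0x80, l1_caches, l2)    # held in core 0's L1 and in the L2
other_line = lookup(0, 0xC0, l1_caches, l2)  # held only in core 1's L1
```

In the second call, neither probe hits even though core 1's L1 holds the line, which reflects the absence of direct cross-interrogation between the L1 caches in this sketch.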