The present invention relates generally to multi-core chips having multiple parent cores and a scout core, and more specifically, to prefetching for multiple parent cores in a multi-core chip.
Growth in single-thread processor performance has been limited by the power that such performance requires. Doubling the power budget of a processor through increased frequency and/or added functional features does not necessarily yield a performance gain greater than or equal to the increase in power, because the performance gained is typically much smaller than the additional power needed to obtain it. To provide chip performance growth, significant portions of the power budget may instead be devoted to placing additional cores on a chip. While cache and memory sharing prevents the performance increase from being equal to the increase in the number of cores, increasing the core count on the chip may yield a greater performance-per-watt gain than solely improving the performance of a single core.
In one approach to enhancing single-thread performance, a secondary core on the same chip as a primary or parent core may be leveraged as a scout core. Specifically, the scout core may be used to prefetch data from a shared cache into the parent core's private cache. This approach may be especially useful in the event the parent core encounters a cache miss. A cache miss occurs when a request for a particular line of data causes a search of a directory of the parent core and the requested cache line is not present. One typical approach to obtaining the missing cache line is to initiate a fetch operation to a higher level of cache. The scout core provides a mechanism for prefetching data needed by the parent core.
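The miss and prefetch behavior described above can be illustrated with a minimal software sketch. It is a toy model only, not the hardware mechanism: the names `PrivateCache`, `scout_prefetch`, and `run` are hypothetical, the private cache is assumed to use LRU replacement, and the scout is simply handed the addresses to prefetch rather than predicting them.

```python
from collections import OrderedDict


class PrivateCache:
    """Toy private cache of a parent core (assumed LRU, hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # directory: line address -> data
        self.hits = 0
        self.misses = 0

    def install(self, addr, data):
        # Place a line in the cache, evicting the least recently used line if full.
        self.lines[addr] = data
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)

    def load(self, addr, shared):
        # Search the directory; on a miss, fetch from the higher-level (shared) cache.
        if addr in self.lines:
            self.hits += 1
            self.lines.move_to_end(addr)
            return self.lines[addr]
        self.misses += 1
        data = shared[addr]
        self.install(addr, data)
        return data


def scout_prefetch(cache, shared, addrs):
    # The scout copies lines from the shared cache into the parent's private cache
    # ahead of use. Here the addresses are supplied by the caller (an assumption).
    for addr in addrs:
        if addr not in cache.lines:
            cache.install(addr, shared[addr])


def run(prefetch):
    shared = {a: f"line-{a}" for a in range(16)}  # stand-in for the shared cache
    cache = PrivateCache(capacity=8)
    trace = [0, 1, 2, 3, 0, 1, 2, 3]  # parent core's access pattern
    if prefetch:
        scout_prefetch(cache, shared, [0, 1, 2, 3])
    for addr in trace:
        cache.load(addr, shared)
    return cache.hits, cache.misses


if __name__ == "__main__":
    print("without scout:", run(prefetch=False))
    print("with scout:   ", run(prefetch=True))
```

In this sketch the parent core takes four misses on its first pass through the trace when no prefetching is done, and none when the scout has already installed those lines.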
Sometimes the chip may include multiple parent cores that cooperate to execute various tasks. For example, in a multi-threaded environment the parent cores may work together on a similar task. Alternatively, one of the parent cores may work on a task and then hand the task off to another parent core. In both cases, the cache miss behavior of one parent core may be correlated with the content worked on by another parent core.
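The correlation described above can be made concrete with a small sketch that measures how many of one parent core's misses fall on lines another parent core has already worked on. The function names and the access traces are hypothetical, and each private cache is assumed to start empty and to be large enough that only first touches miss.

```python
def miss_trace(accesses):
    """Return the addresses that miss, in order, assuming an initially
    empty private cache with no evictions (only first touches miss)."""
    seen = set()
    misses = []
    for addr in accesses:
        if addr not in seen:
            seen.add(addr)
            misses.append(addr)
    return misses


def correlated_fraction(parent_a_accesses, parent_b_accesses):
    """Fraction of parent B's misses that hit lines parent A has touched."""
    touched_by_a = set(parent_a_accesses)
    b_misses = miss_trace(parent_b_accesses)
    if not b_misses:
        return 0.0
    covered = sum(1 for addr in b_misses if addr in touched_by_a)
    return covered / len(b_misses)


if __name__ == "__main__":
    # Hypothetical hand-off: parent A works on lines 10-13, then parent B
    # continues the task and also touches two new lines.
    parent_a = [10, 11, 12, 13, 10, 11]
    parent_b = [12, 13, 14, 12, 15]
    print(correlated_fraction(parent_a, parent_b))
```

With these made-up traces, half of parent B's misses are to lines parent A already touched, which is the kind of overlap a prefetcher serving both cores could exploit.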