As technology trends toward scaling down components, the number of cores/nodes within a system (e.g., a chip) is increasing rapidly. This rapid growth, however, poses design challenges in areas such as memory hierarchy organization and network-on-chip (NoC) floorplanning.
For example, chip multi-processors (CMPs) with distributed caches suffer from a cache fragmentation problem, namely imbalanced cache utilization. In other words, some caches may be over-utilized while other caches are under-utilized. To avoid cache fragmentation, some cache sharing and collaboration techniques have been proposed. Typically, a cache that has reached its maximum capacity can evict (i.e., transfer) some of its data to a remote cache that has extra space rather than simply discarding the data. Oftentimes, however, the data is placed at a far-away node. In that case, it can be more “expensive” to retrieve the data from the remote cache than it would be to simply retrieve the data from lower-level memory. As a result, retrieving cached data from far-away nodes can have a significant impact on latency, interconnect traffic, and overall energy consumption.
Another proposed technique suggests placing the evicted data at its home node. However, statically identifying a per-block home node (e.g., by a hash function or by the low-order bits of the physical address) can lead to high average NoC latency, depending on the distance of the home node from the requesting cache. Consequently, the performance gain can be significantly reduced while interconnect energy consumption is increased.
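By way of illustration only, the effect of a static home-node assignment on NoC distance can be sketched as follows. The sketch assumes a two-dimensional mesh with row-major node numbering, a hop count equal to the Manhattan distance between nodes, 64-byte cache blocks, and a home node selected by the low-order bits of the block address. None of these parameters, nor the function names `home_node`, `hops`, and `average_hops`, is taken from the techniques described above; they are hypothetical and chosen only to make the example concrete.

```python
BLOCK_BITS = 6  # assumed 64-byte cache blocks


def home_node(phys_addr: int, num_nodes: int) -> int:
    """Static home node chosen from the low-order bits of the block address (assumed scheme)."""
    return (phys_addr >> BLOCK_BITS) % num_nodes


def hops(a: int, b: int, width: int) -> int:
    """Manhattan distance between nodes a and b in a mesh numbered row by row."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)


def average_hops(requester: int, width: int, height: int, num_blocks: int = 1024) -> float:
    """Average hop count from one requesting node to the home nodes of consecutive blocks."""
    num_nodes = width * height
    total = sum(
        hops(requester, home_node(block << BLOCK_BITS, num_nodes), width)
        for block in range(num_blocks)
    )
    return total / num_blocks


if __name__ == "__main__":
    # 4x4 mesh: compare a corner node (0) with a more central node (5).
    print(average_hops(0, 4, 4))
    print(average_hops(5, 4, 4))
```

Under these assumptions, the static mapping spreads home nodes evenly over the mesh regardless of which node issues the request, so a requester at a corner of the mesh sees a larger average distance than a requester near the center, and neither can reduce that distance by choosing a nearer node.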
Thus, the inventors recognized a need in the art for a distance-aware cache collaboration architecture that does not incur extraneous overhead.