In multi-core processors and other processing systems, each core may have an associated cache memory, i.e., a private cache accessible only by that core. In addition, a shared cache memory, accessible to all of the cores, may be provided to extend cache capacity. Cache access time can be affected by propagation delays in the electrical circuitry. In general, cache access time may increase in proportion to physical properties such as the distance between the cache and the accessing logic, the width of a data interconnect, and so forth.
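The dependence of access time on distance can be illustrated with a minimal sketch. It assumes a simple linear model in which each interconnect hop between the accessing core and the cache bank adds a fixed delay. The function name, its parameters, and the cycle counts are hypothetical and chosen only for illustration; they do not describe any particular processor.

```python
def access_latency_cycles(base_cycles: int, hops: int, cycles_per_hop: int) -> int:
    """Toy linear model: latency grows with the distance (in hops) to the cache bank.

    base_cycles    -- assumed time to read the bank itself
    hops           -- assumed number of interconnect hops between core and bank
    cycles_per_hop -- assumed propagation delay added by each hop
    """
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return base_cycles + hops * cycles_per_hop


if __name__ == "__main__":
    # Hypothetical values: a bank next to the core versus one three hops away.
    print("adjacent bank:", access_latency_cycles(10, 0, 4))
    print("distant bank: ", access_latency_cycles(10, 3, 4))
```

Under these assumptions a bank three hops away costs more than twice as much to reach as an adjacent one, which is why the placement of data relative to the accessing core matters.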
The optimal cache design for a multi-core architecture remains an open research question, and one of the most basic issues is whether a large cache should be organized as a single (e.g., banked) shared cache or as private caches for the individual cores. A shared last-level cache can perform poorly when the private data in each thread's working set exceeds the capacity of a core's private cache, so that data has to be repeatedly re-fetched from a remote portion of the shared cache. A private last-level cache can perform poorly when threads share most of their working sets. Thus, the optimal choice depends on the total cache capacity, the application's working set size, and the application's data sharing patterns. Both shared and private access patterns are expected to occur, for example, in future recognition, data mining, and synthesis (RMS) applications.
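The trade-off described above can be made concrete with a rough sketch. The model below is an assumption introduced for illustration, not a method taken from this document: accesses are uniform over the working set, a private organization replicates shared data in every core's slice, and a shared organization holds one copy of each line but serves most hits from a remote bank. All names and latency values are hypothetical.

```python
def hit_fraction(capacity: float, working_set: float) -> float:
    """Fraction of uniform accesses that hit in a cache of the given capacity."""
    return 1.0 if working_set <= capacity else capacity / working_set


def private_avg_latency(cores, total_lines, private_ws, shared_ws,
                        local=10, memory=200):
    """Each core owns total_lines / cores and must hold its private data
    plus its own replica of the shared data (assumed model)."""
    slice_lines = total_lines / cores
    h = hit_fraction(slice_lines, private_ws + shared_ws)
    return h * local + (1.0 - h) * memory


def shared_avg_latency(cores, total_lines, private_ws, shared_ws,
                       local=10, remote=30, memory=200):
    """One copy of every line, spread evenly over the banks, so only
    1 / cores of the hits are served by the local bank (assumed model)."""
    distinct = cores * private_ws + shared_ws
    h = hit_fraction(total_lines, distinct)
    hit_latency = local / cores + remote * (cores - 1) / cores
    return h * hit_latency + (1.0 - h) * memory


if __name__ == "__main__":
    cores, total_lines = 4, 4096  # hypothetical configuration
    for label, p, s in [("mostly private data", 1000, 0),
                        ("mostly shared data", 0, 4000)]:
        priv = private_avg_latency(cores, total_lines, p, s)
        shar = shared_avg_latency(cores, total_lines, p, s)
        print(f"{label}: private={priv:.1f} cycles, shared={shar:.1f} cycles")
```

With these assumed numbers, the private organization is faster when each thread works mostly on its own data, and the shared organization is faster when the threads share most of their working set, because replication leaves each private slice too small to hold the shared data. The crossover point shifts with the total capacity and the latency values chosen.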