Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores, multiple hardware threads, and multiple logical processors present on individual integrated circuits. A processor or integrated circuit typically includes a single physical processor die, where the processor die may include any number of cores, hardware threads, or logical processors. The ever-increasing number of processing elements (cores, hardware threads, and logical processors) on integrated circuits enables more tasks to be accomplished in parallel. However, the execution of more threads and tasks puts an increased premium on shared resources, such as memory/cache and the management thereof.
Traditional chip multiprocessor (CMP) architectures incorporate a shared and distributed cache structure. Cache slices of a cache may be co-located with respective cores on a CMP because the physically closest cache slice offers the quickest access times. All cores on a CMP, however, usually have access to all cache slices on the CMP. An address maps to a unique cache slice based on an address hash function, which determines cache ownership for the address. While sharing allows cache usage to be optimized in a flexible manner, sharing a distributed cache among multiple cores creates coherency challenges. As just one example, if a change in the address hash function for a portion of the cache lines is triggered (e.g., due to powering down certain cores and co-located cache lines because of low utilization), then the CMP is quiesced and all affected cache addresses are flushed. This action has a significant negative impact on performance, as the CMP determines the affected cache lines and executes the flush, causing delays on the order of milliseconds. Thus, there is a need to handle hash changes more efficiently in CMPs with distributed cache architectures.
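The ownership problem described above can be illustrated with a small software model. The following is a minimal sketch only, not a description of any actual hardware hash: the 64-byte line size, the XOR-fold hash, and the names slice_for and lines_to_flush are all assumptions introduced for illustration. It shows how an address maps to one slice, and how removing a slice from the active set changes the owner of some cached lines, which are the lines a conventional design would need to flush.

```python
"""Toy model of address-to-slice hashing in a distributed cache.

Everything here is hypothetical: the line size, the hash function, and the
function names are illustrative assumptions, not taken from a real CMP.
"""

LINE_BYTES = 64  # assumed cache-line size in bytes


def slice_for(addr, active_slices):
    """Return the slice that owns `addr` under the current active set.

    Hypothetical hash: XOR-fold the line index, then reduce it modulo the
    number of active slices. Each address has exactly one owner.
    """
    line = addr // LINE_BYTES
    folded = line ^ (line >> 7) ^ (line >> 13)
    return active_slices[folded % len(active_slices)]


def lines_to_flush(cached_addrs, old_slices, new_slices):
    """Return the cached addresses whose owning slice changes.

    These are the lines affected by the hash change; in the conventional
    approach they are flushed while the CMP is quiesced.
    """
    return [
        addr
        for addr in cached_addrs
        if slice_for(addr, old_slices) != slice_for(addr, new_slices)
    ]


if __name__ == "__main__":
    # One address per cache line, 1024 lines in total.
    cached = [i * LINE_BYTES for i in range(1024)]
    before = [0, 1, 2, 3]  # four active slices
    after = [0, 1, 2]      # slice 3 powered down (e.g., low utilization)
    moved = lines_to_flush(cached, before, after)
    print(f"{len(moved)} of {len(cached)} cached lines change owner")
```

In this sketch, a simple modulo-based hash reassigns lines that were owned by surviving slices as well as those owned by the powered-down slice, so the number of affected lines depends on the hash chosen and can be much larger than the contents of the removed slice.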