Technology pertaining to processor design and manufacture has advanced such that many commercially available computing devices include multi-core processors. In the recent past, processors were developed with a single core. The processor core is the portion of the processor that reads and executes instructions. More recently, multi-core processors have been developed, where a multi-core processor is composed of two or more independent cores. Typically, these processor cores are manufactured on a single integrated circuit die.
Extending the multi-core architecture, architectures have been proposed that include numerous processor cores (e.g., several cores on a single chip). When such a large number of processor cores is included in a particular architecture, conventional multi-core techniques for message passing, cache coherency, and the like do not scale. Cluster-on-chip (CoC) is a cluster/grid model system that is composed of complex computation, memory, and I/O subsystems, all of which are interconnected through a mesh network. There are several challenges to the programmability of these cluster-on-chip systems, the primary challenge being the lack of a hardware cache coherency scheme between cores in the CoC memory topology.