A primary factor in the utility of a computer system is the speed at which it can execute an application. Instructions and data must be available at least as fast as they are needed; otherwise the computer system idles or stalls while it waits for the instructions and/or data to be fetched from memory (e.g., main memory and caches).
Significant advances continue to be achieved in microprocessor technologies and architectures. These advances have resulted in substantial increases in processing power or speed and in the capacity of on-chip memory (e.g., caches). Increases in processing speed have been achieved by including multiple central processing unit cores (“core processors” or “cores”) on a chip. Each core processor can initiate transactions such as memory requests to read/load data from or store/write data to memory.
In modern communication networks, many applications performed at network nodes are executable in parallel, which makes multi-core chips particularly useful in network devices such as routers, switches, servers, and the like. The complexity and bandwidth of modern communication networks have grown with rising demand for data connectivity, network-based applications, and access to the Internet. Accordingly, the number of core processors in multi-core chips has increased in recent years to accommodate the demand for more processing power within network devices.
However, as the number of core processors within a chip increases, managing access to the corresponding on-chip memory as well as attached memory (e.g., main memory) becomes increasingly challenging. For example, when multiple core processors issue memory requests simultaneously, requests directed to the same memory component can contend with one another, and congestion increases in the network or system that transports the requests. Both contention and congestion can increase latency and decrease performance.
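By way of illustration only, the contention described above can be sketched with a small model. The model assumes, hypothetically, that memory is divided into banks interleaved at cache-line granularity and that requests arriving at the same bank in the same cycle are served one at a time. The 64-byte line size, the bank count, and the names `bank_of` and `contention_profile` are assumptions made for this sketch and do not describe any particular chip.

```python
from collections import Counter


def bank_of(address: int, num_banks: int, line_bytes: int = 64) -> int:
    """Map an address to a bank by interleaving cache lines across banks.

    The interleaving scheme and the 64-byte line size are assumptions.
    """
    return (address // line_bytes) % num_banks


def contention_profile(addresses, num_banks, line_bytes=64):
    """Summarize one batch of simultaneous requests.

    Returns (requests per bank, worst-case queue depth). A queue depth of 1
    means no request had to wait behind another at the same bank.
    """
    per_bank = Counter(bank_of(a, num_banks, line_bytes) for a in addresses)
    worst = max(per_bank.values(), default=0)
    return per_bank, worst


# Four hypothetical cores each issue one request in the same cycle.
spread = [0x0000, 0x0040, 0x0080, 0x00C0]  # consecutive lines, four banks
hot = [0x0000, 0x0100, 0x0200, 0x0300]     # stride of four lines, one bank

_, spread_depth = contention_profile(spread, num_banks=4)
_, hot_depth = contention_profile(hot, num_banks=4)
print(spread_depth, hot_depth)
```

Under these assumptions, the first batch spreads across all four banks and no request waits, whereas the second batch lands entirely on one bank and its requests are serialized. Adding cores enlarges each batch, which makes the second case more likely.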