Modern computer systems typically employ multi-core processors with two or more independent processing cores that read and execute program instructions. Multi-core processors are often used in systems that feature shared memory resources. Shared memory is often implemented as a large block of random access memory that can be accessed simultaneously by several different processing units or cores in a multi-processor system. Shared memory provides efficient communication among the different processing units and avoids the need to make redundant copies of data, since all processors share a single view of the data stored in the shared memory.
Memory controllers service requests to the shared memory from multiple sources (e.g., multiple cores, processors, co-processors, and so on), and these individual request streams can interfere with each other, such that certain requests are blocked while the controller is busy servicing other requests. Several types of interference are possible. Common examples include bank contention (e.g., a request from core 0 must wait because the target bank is busy servicing a request from core 1) and row-buffer conflicts (e.g., core 1 must close a row buffer corresponding to a page opened by core 0).
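The two interference conditions named above can be illustrated with a minimal sketch. It is not taken from any particular memory controller: the `Bank` class, the `service` function, the event labels, and the fixed four-cycle service time are all assumptions made for illustration, and real DRAM timing is considerably more detailed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bank:
    """Hypothetical model of one DRAM bank and its row buffer."""
    open_row: Optional[int] = None    # row currently held in the row buffer
    opened_by: Optional[int] = None   # core whose request opened that row
    busy_until: int = 0               # cycle at which the bank becomes free
    busy_core: Optional[int] = None   # core whose request occupies the bank


def service(bank: Bank, core: int, row: int, now: int,
            service_cycles: int = 4) -> List[str]:
    """Service one request and report the interference it encountered.

    service_cycles is an assumed fixed cost per request.
    """
    events = []
    # Bank contention: the bank is still busy with another core's request.
    if now < bank.busy_until and bank.busy_core != core:
        events.append("bank_contention")
    if bank.open_row is None:
        events.append("row_empty")
    elif bank.open_row == row:
        events.append("row_hit")
    elif bank.opened_by != core:
        # Row-buffer conflict: a page opened by another core must be closed.
        events.append("row_buffer_conflict")
    else:
        events.append("row_miss_same_core")
    if bank.open_row != row:
        bank.open_row, bank.opened_by = row, core
    bank.busy_until = max(now, bank.busy_until) + service_cycles
    bank.busy_core = core
    return events


# Bank contention: core 0 arrives while the bank is servicing core 1.
bank_a = Bank()
first_a = service(bank_a, core=1, row=3, now=0)
second_a = service(bank_a, core=0, row=3, now=1)

# Row-buffer conflict: core 1 must close the page opened by core 0.
bank_b = Bank()
first_b = service(bank_b, core=0, row=5, now=0)
second_b = service(bank_b, core=1, row=9, now=10)
```

In this sketch `second_a` reports bank contention together with a row hit, while `second_b` reports a row-buffer conflict on an otherwise idle bank, which shows that the two conditions are independent of each other.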
Various solutions have been developed to reduce the effect of interference in shared memory systems. For example, existing memory control algorithms may attempt to improve fairness and quality of service (QoS) by monitoring high-level metrics, such as a core's memory request rate, or by differentiating between the memory streams of different processors (e.g., CPU vs. GPU), and then accounting for these differences in their scheduling decisions, such as by changing the priorities of the streams. However, these approaches use high-level metrics that essentially focus only on bandwidth utilization or on metrics tied to bandwidth utilization. For example, existing solutions may attempt to distribute memory resources evenly among the cores based on bandwidth capacity as a way to ensure fairness. Such methods, however, do not account for the actual needs of the different cores, nor do they account for the impact that memory usage by certain cores has on the other cores. Other high-level metrics, such as a high request rate by a particular core, may suggest higher contention but do not necessarily imply it. For example, a high request rate isolated to one or a few banks may cause fewer row-buffer conflicts than a lower request rate distributed across all of the banks.
Most current systems do not explicitly monitor the lower-level behavior of memory requests at the level of bank utilization or row-buffer conflicts, even though this information is typically much more useful than the common high-level metrics and bandwidth-oriented approaches for identifying particular contention problems, associating problems with specific elements, and indicating optimal solutions. Although some prior approaches to interference issues have considered certain finer-grained metrics, they generally do not consider the direct interaction between cores. For example, Thread Cluster Memory (TCM) scheduling systems monitor bank-level parallelism (BLP) per core, but do not track whether or how a higher level of BLP impacts the performance of other cores in the processor.
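The observation that request rate alone does not predict row-buffer conflicts can be checked with a toy model. The sketch below assumes requests are serviced strictly in arrival order with one row buffer per bank. The helper names `conflicts_suffered` and `build_trace`, the four-bank layout, and the two synthetic access patterns are hypothetical and chosen only to make the effect visible.

```python
from collections import Counter


def conflicts_suffered(trace, num_banks):
    """Count, per core, row-buffer conflicts caused by a different core.

    trace is a list of (core, bank, row) tuples serviced in order.
    """
    open_row = [None] * num_banks    # row held in each bank's row buffer
    opened_by = [None] * num_banks   # core that opened that row
    suffered = Counter()
    for core, bank, row in trace:
        if open_row[bank] is not None and open_row[bank] != row:
            if opened_by[bank] != core:
                suffered[core] += 1  # page opened by another core is closed
        if open_row[bank] != row:
            open_row[bank], opened_by[bank] = row, core
    return suffered


def build_trace(aggressor_banks, rounds=10, num_banks=4):
    """Interleave a victim (core 0) with an aggressor (core 1).

    Each round the victim touches row 0 of every bank, then the
    aggressor touches row 1 of each bank listed in aggressor_banks.
    """
    trace = []
    for _ in range(rounds):
        trace += [(0, bank, 0) for bank in range(num_banks)]
        trace += [(1, bank, 1) for bank in aggressor_banks]
    return trace


# High request rate (8 per round) confined to bank 0.
concentrated = conflicts_suffered(build_trace([0] * 8), num_banks=4)
# Lower request rate (4 per round) spread across all four banks.
spread = conflicts_suffered(build_trace([0, 1, 2, 3]), num_banks=4)

print(concentrated[0], spread[0])  # prints "9 36"
```

Under these assumptions the victim core suffers 9 inter-core conflicts against the aggressor that issues 8 requests per round, and 36 against the aggressor that issues only 4, so a scheduler ranking cores by request rate alone would penalize the less disruptive stream.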
Furthermore, memory controllers typically do not have a way to track how much interference is caused by the various cores in the system, and thus their ability to take these effects into account when scheduling memory requests is limited or non-existent. To the extent that memory usage is tracked, present solutions generally limit the usage of any tracked information to the memory scheduler itself. The memory scheduler may make different decisions based on the tracked information, but the rest of the system is oblivious to any contention and interference issues in the memory controller and main memory. This can result in poor memory scheduling decisions leading to reduced performance, reduced throughput, reduced fairness/QoS, and possibly increased power consumption/decreased energy efficiency.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches.