A computing system is typically composed of hardware and software components that interact with each other. The hardware components can be described generally as the physically tangible parts of such a computing system, such as processors, memory chips, hard drives, connecting wires, traces, and the like. Moreover, such processing hardware components are constructed to recognize two logical states, namely a “0” state (or low electrical state) and a “1” state (or high electrical state). Employing a number of such states together in a sequence allows data to be stored and processed by the hardware.
Furthermore, hardware manufacturers are developing computing platforms with multiple processors, as opposed to a single processor, each of which can further contain multiple processing cores rather than a single core. Additionally, recent trends have produced processors with multiple “logical” processors, as employed in simultaneous multi-threading, for example. Such logical processors typically share functional resources including adders, memory storage media and the like. Likewise, caches can now be shared between both physical and logical processors. Similarly, buses can be implemented as shared resources for efficiency gains and/or reduction in complexity and cost. Accordingly, hardware components in a computing system are becoming more complex in their architecture, which varies substantially from one computing platform to another.
Moreover, with the trend towards multi-core architectures, systems that include multiple memory controllers are becoming increasingly significant. In general, each memory controller can be treated as an independent entity that performs its own decision-making. For example, a multi-core processing system can include N cores and M memory controllers (where N and M are integers), and a “core” can include instruction processing pipelines (integer and floating-point), instruction execution units, and the L1 instruction and data caches. Many general-purpose computers manufactured today are dual-core systems (N=2), in which two separate yet identical cores exist. In multiprocessor-based system architectures, cores can exist on the same or different physical chips, which may or may not be identical.
In such systems, each core can have its own private L2 cache, or alternatively the L2 cache can be shared between different cores. Regardless of whether the L2 cache is shared, the physical DRAM (e.g., the memory banks in which the actual data is stored) of current multi-core systems is typically shared among all cores. Hence, memory requests from different threads executing on different cores contend for the same memory system, which can in turn require appropriate buffering and scheduling policies.
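By way of a non-limiting illustration, the contention described above can be sketched as a single request buffer that receives memory requests from several cores. The sketch below is an assumption made for illustration only: the names `MemoryRequest` and `SharedRequestBuffer`, the buffer capacity, and the first-come, first-served (FCFS) policy are all hypothetical and are not drawn from any particular memory controller.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MemoryRequest:
    core_id: int   # core whose thread issued the request
    address: int   # physical address being accessed


class SharedRequestBuffer:
    """Hypothetical buffer fed by all cores and drained in FCFS order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._queue = deque()

    def enqueue(self, request):
        """Buffer a request; return False if the buffer is full."""
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(request)
        return True

    def schedule_next(self):
        """Assumed FCFS policy: return the oldest request, or None if empty."""
        return self._queue.popleft() if self._queue else None


buffer = SharedRequestBuffer(capacity=4)

# Threads on two cores interleave requests to the same shared DRAM.
for core_id, address in [(0, 0x1000), (1, 0x2000), (0, 0x1040), (1, 0x2040)]:
    buffer.enqueue(MemoryRequest(core_id, address))

# Drain the buffer in the order chosen by the scheduling policy.
service_order = []
while (req := buffer.schedule_next()) is not None:
    service_order.append((req.core_id, req.address))
```

Under the assumed FCFS policy, requests are serviced strictly in arrival order regardless of which core issued them. Other scheduling policies would replace only the body of `schedule_next`.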
Moreover, the totality of a system's DRAM can be partitioned across multiple DRAM chips. Typically, a DRAM chip is organized into multiple banks, each of which stores a subset of the total physical memory managed by that chip. The underlying rationale for organizing DRAM chips into multiple banks is that memory requests to different banks can be serviced in parallel. Each DRAM bank has a two-dimensional structure consisting of multiple rows and columns. Consecutive addresses in memory are located in consecutive columns in the same row. The size of a row varies, but it is usually between 1 KB and 32 KB in commodity DRAMs. Efficient scheduling of requests to the DRAM requires sophisticated and complex decisions in order to achieve high performance.
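As a non-limiting illustration of the bank, row, and column organization described above, the following sketch decomposes a physical address into those three coordinates. The row size (8 KB), the bank count (8), and the row-interleaved mapping are assumptions chosen for the example; actual address mappings vary between memory controllers and DRAM devices. The function names are hypothetical.

```python
ROW_SIZE_BYTES = 8 * 1024  # assumed row size; real row sizes vary
NUM_BANKS = 8              # assumed number of banks in the chip


def decode_address(address):
    """Split an address into (bank, row, column) under the assumed mapping.

    Consecutive addresses fall in consecutive columns of the same row;
    once a row is exhausted, the next address moves to the next bank.
    """
    column = address % ROW_SIZE_BYTES
    bank = (address // ROW_SIZE_BYTES) % NUM_BANKS
    row = address // (ROW_SIZE_BYTES * NUM_BANKS)
    return bank, row, column


def group_by_bank(addresses):
    """Group requests by bank; different banks can be serviced in parallel."""
    banks = {}
    for address in addresses:
        bank, row, column = decode_address(address)
        banks.setdefault(bank, []).append((row, column))
    return banks


requests_by_bank = group_by_bank([0, 64, 8192])
```

Here addresses 0 and 64 map to the same row of bank 0, while address 8192 maps to bank 1 and could therefore be serviced in parallel with the other two under the assumed mapping.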