Multi-core architectures have recently attracted substantial attention because of the increasing difficulty of pushing processor core speeds beyond the few-GHz mark already reached some years back. Therefore, the computer devices industry has recently focused on instantiating the same processor core multiple times (dual-core, quad-core) and on improving communication mechanisms between multiple cores. In contrast, the consumer devices industry has always looked at heterogeneous compute platforms that utilize a mix of industry-standard CPU, fixed-point DSP, VLIW, and function-specific HW cores, an example being the Nexperia™ platform [see also the following publications: “S. Dutta et al. Viper: A multiprocessor SOC for Advanced Set-Top Box and Digital TV Systems. IEEE Design & Test of Computers, September-October 2001, pages 21-31”; and “Claasen, T.A.C.M.: System on a chip: changing IC design today and in the future, Micro, IEEE, Volume 23, Issue 3, May-June 2003, pages 20-26”]. An important advantage of the heterogeneous platform approach is that algorithms can be executed on the processor core that is best suited for them. Functional subsystems, consisting of several co-operating algorithms, are implemented on a single processor core, possibly supported by function-specific HW cores. The functional subsystems have well-defined communication interfaces, which keeps the debugging and system integration effort low. Recent advances in CMOS technology allow integration of an ever-growing number of processor cores on a single die. This high level of integration offers a cost reduction, whilst at the same time increasing competition for the use of scarce shared HW resources.
A common architecture for a System-on-Chip (SoC) is one where several agents (IP cores, IP blocks, functional blocks, etc.) access a shared memory (for example a DRAM or an SDRAM) via a memory controller. In such an architecture the memory controller arbitrates between requests (transactions with the memory) from the different agents. In certain SoCs from NXP the requests are split up into two categories: low-latency (LL) requests and constant-bandwidth (CB) requests. In those SoCs, the CB-requests are guaranteed a limited latency and a constant transaction rate by an accounting mechanism, i.e. a so-called CB-account is maintained, which keeps track of the latitude with respect to the latency-rate guarantee of the CB stream. The LL-requests have the highest priority and the CB-requests are serviced when there are no LL-requests. When the CB-account reaches a certain threshold value (boost value), which indicates that the guarantee is about to be violated, the LL-requests are blocked and the CB-requests get serviced. In that way the CB-requests get the guaranteed maximum latency and a minimum rate (bandwidth). This is implemented in the IP2032 memory controller, used in several SoCs from NXP. It is also included in the IP2035 memory controller.
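The LL/CB arbitration described above can be illustrated with a minimal sketch. It is not the IP2032 or IP2035 implementation: the class `CBAccount`, the functions `arbitrate` and `step`, the parameters `rate`, `cost` and `boost`, and the assumption that the account counts upwards towards the boost value are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CBAccount:
    """Hypothetical CB-account (all fields are illustrative assumptions)."""
    rate: int = 1    # assumed amount added per cycle while CB work is pending
    cost: int = 4    # assumed amount removed per serviced CB-request
    boost: int = 8   # assumed threshold value (boost value)
    value: int = 0

    def boosted(self) -> bool:
        # Assumed polarity: the account rises towards the boost value.
        return self.value >= self.boost


def arbitrate(ll_pending: bool, cb_pending: bool, account: CBAccount) -> str:
    """Pick the request class to service in the current cycle."""
    if cb_pending and account.boosted():
        return "CB"    # guarantee at risk: LL-requests are blocked
    if ll_pending:
        return "LL"    # LL-requests normally have the highest priority
    if cb_pending:
        return "CB"    # CB-requests are serviced when no LL-request is pending
    return "IDLE"


def step(ll_pending: bool, cb_pending: bool, account: CBAccount) -> str:
    """Arbitrate one cycle, then update the account."""
    choice = arbitrate(ll_pending, cb_pending, account)
    if cb_pending:
        account.value += account.rate
    if choice == "CB":
        account.value = max(0, account.value - account.cost)
    return choice
```

Under these assumed parameters, a continuous stream of LL-requests is serviced until the account reaches the boost value, after which one CB-request is serviced before LL traffic resumes.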
The memory that is shared amongst the plurality of agents is generally a volatile memory (DRAM, SDRAM) that requires refresh commands. A further context of the invention is that the memory is a separate chip or chipset. This implies certain problems and limitations to be overcome. The access path is via pins, which are very costly in terms of area, etc. The limited number of pins reduces the available memory access bandwidth to a level which is barely enough for the memory requirements. This makes the memory bandwidth a system bottleneck. The memory is shared by the agents, both in terms of memory space and, more importantly, in terms of memory access bandwidth.
In the known SoC, refresh commands are generated internally in the memory controller by a refresh command generator (RCG). Refresh commands normally have the lowest priority, but both LL-requests and CB-requests are blocked when there is a risk of data loss in the memory. The RCG in the memory controller issues a refresh command at a regular period. Each such refresh command remains pending until it is serviced by the memory controller. The RCG also counts the number of pending refresh commands. When a refresh command is serviced, the number is decreased. There is a maximum allowed number of pending refresh commands. That number depends on the type of the memory (here DRAM) and the configuration of the RCG. Typical values are between 2 and 8. When the maximum allowed number of pending refresh commands is reached, a refresh command has to be serviced. This is achieved by blocking both the LL-requests and the CB-requests until the number of pending refresh commands has decreased again.
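The behaviour of the RCG described above can be sketched as a simple counter. This is a minimal model under stated assumptions, not the actual RCG: the class name `RefreshCommandGenerator`, its method names, and the cycle-based notion of `period` are hypothetical.

```python
class RefreshCommandGenerator:
    """Hypothetical model of an RCG that counts pending refresh commands."""

    def __init__(self, period: int, max_pending: int) -> None:
        self.period = period            # assumed: cycles between issued refreshes
        self.max_pending = max_pending  # typical values are between 2 and 8
        self.pending = 0
        self._cycles = 0

    def tick(self) -> None:
        """Advance one cycle; issue a refresh command every `period` cycles."""
        self._cycles += 1
        if self._cycles == self.period:
            self._cycles = 0
            self.pending += 1

    def service(self) -> None:
        """The memory controller has serviced one pending refresh command."""
        if self.pending > 0:
            self.pending -= 1

    @property
    def must_refresh(self) -> bool:
        # True when LL-requests and CB-requests have to be blocked.
        return self.pending >= self.max_pending
```

In this sketch the arbiter is expected to call `service()` whenever `must_refresh` is true, so that the count does not grow beyond the maximum allowed number.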
LL-requests are thus blocked when the number of pending refresh commands has reached its maximum allowed value and/or when the CB-account value is lower than the threshold value (boost value). CB-requests are only blocked when the number of pending refresh commands has reached its maximum allowed value.
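These blocking rules can be written down compactly as follows. The sketch is illustrative only: the function names `blocked_classes` and `select` are hypothetical, and the state of the CB-account is abstracted into a single assumed flag `cb_boost` that is true whenever the account indicates that the CB guarantee is about to be violated.

```python
from typing import Tuple


def blocked_classes(pending_refreshes: int, max_pending: int,
                    cb_boost: bool) -> Tuple[bool, bool]:
    """Return (ll_blocked, cb_blocked) for the current cycle."""
    refresh_forced = pending_refreshes >= max_pending
    ll_blocked = refresh_forced or cb_boost
    cb_blocked = refresh_forced
    return ll_blocked, cb_blocked


def select(ll_pending: bool, cb_pending: bool, pending_refreshes: int,
           max_pending: int, cb_boost: bool) -> str:
    """Pick what the memory controller services next (assumed ordering)."""
    ll_blocked, cb_blocked = blocked_classes(
        pending_refreshes, max_pending, cb_boost)
    if ll_blocked and cb_blocked:
        return "REFRESH"   # forced refresh overrides all requests
    if ll_pending and not ll_blocked:
        return "LL"        # normal case: LL-requests have the highest priority
    if cb_pending:
        return "CB"        # either no LL-request, or LL blocked by the boost
    if pending_refreshes > 0:
        return "REFRESH"   # refresh normally has the lowest priority
    return "IDLE"
```

For example, with a maximum of 4 pending refresh commands, `select(True, True, 0, 4, True)` yields `"CB"` because the boost blocks the LL-request, whereas `select(True, True, 4, 4, False)` yields `"REFRESH"`.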
As illustrated above, a problem of the known memory controller is that the internal arbitration between the low-latency (high-priority) requests and the constant-bandwidth (low-priority) requests is rendered relatively complicated by the refresh commands, which may sometimes get a higher priority than the other requests, i.e. the memory controller effectively works with three levels of priority.