In order to reduce the latency associated with accessing data stored in main memory, processors (such as CPUs or GPUs) typically have one or more caches, as shown in the example memory hierarchy 100 in FIG. 1. There are typically two levels of on-chip cache, L1 102 and L2 104, which are usually implemented with SRAM (static random access memory). The caches are smaller than the main memory 108, which may be implemented in DRAM (dynamic random access memory), but the latency involved in accessing a cache is much shorter than for main memory, and gets shorter at lower levels within the hierarchy (i.e. closer to the processor). As the latency is related, at least approximately, to the size of the cache, a lower-level cache (e.g. L1) is smaller than a higher-level cache (e.g. L2).
When a processor accesses a data item, the data item is accessed from the lowest level in the hierarchy where it is available. For example, a look-up will be performed in the L1 cache 102 and, if the data is in the L1 cache, this is referred to as a cache hit and the data can be loaded into one of the registers 110. If, however, the data is not in the L1 cache (the lowest level cache), this is a cache miss and the next levels in the hierarchy are checked in turn until the data is found (e.g. the L2 cache 104 is checked in the event of an L1 cache miss). In the event of a cache miss, the data is brought into the cache (e.g. the L1 cache 102) and, if the cache is already full, a replacement algorithm may be used to decide which existing data will be evicted (i.e. removed) in order that the new data can be stored.
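By way of illustration only, the hit, miss and eviction behaviour described above may be sketched in software as follows. The sketch assumes a fully associative cache and a least recently used (LRU) replacement algorithm, neither of which is specified above. The names SimpleCache, lookup and fill are hypothetical and do not correspond to any element of FIG. 1.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy fully associative cache with LRU replacement (an assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> data, least recently used first

    def lookup(self, address):
        """Return (hit, data); a hit marks the entry as most recently used."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return True, self.lines[address]
        return False, None

    def fill(self, address, data):
        """Store data after a miss; return the evicted address, or None."""
        evicted = None
        if address not in self.lines and len(self.lines) >= self.capacity:
            # Cache is full: evict the least recently used entry.
            evicted, _ = self.lines.popitem(last=False)
        self.lines[address] = data
        self.lines.move_to_end(address)
        return evicted


# Usage: a two-entry cache evicts the entry that was touched least recently.
l1 = SimpleCache(capacity=2)
l1.fill(0x10, "a")
l1.fill(0x20, "b")
l1.lookup(0x10)               # hit; 0x10 becomes most recently used
evicted = l1.fill(0x30, "c")  # cache is full, so 0x20 is evicted
```

Other replacement algorithms (e.g. random or first-in first-out) could be substituted by changing only the line that selects the entry to evict.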
If a data item is not in any of the on-chip caches (e.g. not in the L1 cache 102 or the L2 cache 104 in the hierarchy shown in FIG. 1), then a memory request is issued onto an external bus (which may also be referred to as the interconnect fabric) so that the data item can be obtained from the next level in the hierarchy (e.g. the main memory 108).
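The order in which the levels of the hierarchy are checked may, purely as an example, be modelled as shown below. The sketch assumes that each cache is represented by a dictionary of unlimited capacity and that data found at a higher level is copied into every lower level that missed; both are simplifying assumptions rather than features of the hierarchy 100. The function name load and the counter memory_requests are hypothetical.

```python
def load(address, caches, main_memory, stats):
    """Search each cache level in turn; fall back to main memory on a full miss.

    caches is ordered from the lowest level (e.g. L1) to the highest (e.g. L2).
    """
    for level, cache in enumerate(caches):
        if address in cache:
            data = cache[address]  # cache hit at this level
            break
    else:
        # Missed in every on-chip cache: model a request on the external bus.
        stats["memory_requests"] += 1
        data = main_memory[address]
        level = len(caches)
    # Assumed fill policy: copy the data into every level that missed.
    for cache in caches[:level]:
        cache[address] = data
    return data


# Usage: the first access goes to main memory, the second hits in L1.
l1_cache, l2_cache = {}, {}
memory = {0x40: 7}
stats = {"memory_requests": 0}
first = load(0x40, [l1_cache, l2_cache], memory, stats)
second = load(0x40, [l1_cache, l2_cache], memory, stats)
```

In this model only the first access increments memory_requests, reflecting that a memory request is issued onto the external bus only when the data item is absent from all of the on-chip caches.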
The embodiments described below are provided by way of example only and are not limiting of implementations which solve any or all of the disadvantages of known methods of managing access to memory.