Embodiments of the present invention relate to software-controlled caching and ordered synchronization.
Network processors are becoming a core element of high-speed communication routers, and are designed specifically for packet processing applications. In most packet processing applications, individual packets can typically be processed independently and in parallel with other packets. To take advantage of this parallelism, some network processors, including Intel® IXP network processors, contain many small, multi-threaded processing engines called microengines (MEs) for handling packets.
Individual MEs typically do not have hardware caches, which increases the density of MEs that can be placed in a given area. However, network processors have a memory hierarchy whose levels differ in capacity and access latency. For example, a network processor may include Local Memory, Scratchpad Memory, SRAM, and DRAM, each having a different latency, as shown in Table 1, below.
TABLE 1

Memory Level        Logical Width (bytes)   Size (bytes)   Unloaded latency in ME cycles
Local Memory        4                       2560           3
Scratchpad Memory   4                       16K            60
SRAM                4                       4M-256M        90-150
DRAM                8                       64M-2G         120-300
Because MEs typically do not contain hardware caches, it is important to minimize memory accesses and access latencies in order to increase packet throughput.
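As a rough illustration of why access latency matters, the following sketch estimates the average cost of a memory access when some fraction of accesses is served from a fast memory level instead of a slow one. The latency values come from Table 1, using the lower bound of each range. The function name `average_access_cycles`, the dictionary `LATENCY_CYCLES`, and the two-level cost model itself are assumptions made for illustration; the model ignores lookup overhead, memory contention, and the latency hiding provided by multi-threading.

```python
# Unloaded latencies in ME cycles from Table 1.
# For SRAM and DRAM the lower bound of the listed range is used (assumption).
LATENCY_CYCLES = {
    "local_memory": 3,
    "scratchpad": 60,
    "sram": 90,
    "dram": 120,
}


def average_access_cycles(hit_rate, fast="local_memory", slow="sram"):
    """Estimate expected ME cycles per access.

    hit_rate is the fraction of accesses served from the `fast` level;
    the remaining accesses are assumed to pay only the `slow` level's
    latency. This is a simplified, hypothetical cost model.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return (hit_rate * LATENCY_CYCLES[fast]
            + (1.0 - hit_rate) * LATENCY_CYCLES[slow])


if __name__ == "__main__":
    # Print the estimate for a few hit rates, Local Memory in front of SRAM.
    for rate in (0.0, 0.5, 0.9, 1.0):
        print(rate, average_access_cycles(rate))
```

Under these assumptions, serving 90% of accesses from Local Memory rather than SRAM lowers the estimated average from 90 cycles to roughly 11.7 cycles per access.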