A symmetric multiprocessing (SMP) architecture is generally a multiprocessor computer architecture in which two or more identical processors connect to a single shared main memory. In the case of multi-core processors, the SMP architecture can apply to the CPU cores.
In an SMP architecture, multiple networking CPUs or CPU cores can receive and transmit network traffic. Generally, network packets received by the NAE are passed to a packet ordering engine (POE). In an SMP system, the POE helps control not only the order in which packets are transmitted out of the system, but also the order in which packets are processed within the system.
Typically, to perform load balancing, the system processes packets on a first-come, first-served, round-robin basis. Thus, when the POE receives one or more packets from the NAE, before sending the packets to a CPU core for processing, the POE sets the order of the packets according to when each packet was received by the POE. When the POE receives processed packets back from the CPU cores, it orders them according to that set order. Thus, the POE sends out processed packets in exactly the same order in which those packets were received.
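The arrival-order behavior described above can be sketched in a few lines of Python. This is only an illustrative software model under stated assumptions, not the interface of any actual POE: the class name `ReorderBuffer` and the methods `ingress` and `egress` are hypothetical. Each packet is tagged with a sequence number on arrival, and a processed packet is held back until every earlier packet has been released.

```python
class ReorderBuffer:
    """Toy model of arrival-order egress (hypothetical, not a real POE API)."""

    def __init__(self):
        self._next_seq = 0   # sequence number given to the next arriving packet
        self._next_out = 0   # sequence number that must leave the system next
        self._done = {}      # processed packets still waiting for their turn

    def ingress(self, packet):
        """Tag a packet with its arrival order before dispatch to a core."""
        seq = self._next_seq
        self._next_seq += 1
        return seq, packet

    def egress(self, seq, packet):
        """Accept a processed packet; return the packets now releasable in order."""
        self._done[seq] = packet
        released = []
        while self._next_out in self._done:
            released.append(self._done.pop(self._next_out))
            self._next_out += 1
        return released


rob = ReorderBuffer()
tagged = [rob.ingress(p) for p in ("A", "B", "C")]

# Assume the cores finish out of order: C first, then A, then B.
out = []
for seq, pkt in (tagged[2], tagged[0], tagged[1]):
    out.extend(rob.egress(seq, pkt))

print(out)  # ['A', 'B', 'C']
```

In this model, packet C is held until A and B have both been released, so the output order matches the arrival order regardless of which core finishes first.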
In addition, a conventional POE also allows packets from multiple flows to be grouped by their corresponding flow identifiers. Thus, the POE can map a particular L4 flow to a POE flow. A POE flow generally represents a queue. For example, a POE may support 64K queues and 64K slots. A slot generally refers to a buffer or a packet descriptor. While ordering received packets, the POE can parse a packet, classify it as belonging to either an L3 or an L4 flow, and then map the packet to a queue in the POE based on the flow ID associated with the packet. Specifically, the POE can extract a key from a received packet and apply a configured hashing algorithm whose output indexes into a range of flow identifiers. In essence, the POE maps a packet to a flow. When the packet arrives at the CPU processing core, the packet includes information about both its flow identifier and its packet descriptors. Note that the POE may use 64K queues, with each queue corresponding to one slot, or it may use a single queue that uses all of the 64K slots. Alternatively, the system may determine the ingress CPU to which a received packet will be sent based on the flow key, without using the POE.
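The key-extraction and hashing step can be illustrated with a minimal sketch. The text does not specify the key fields or the hash function, so both are assumptions here: the key is built from a conventional address, protocol, and port tuple, and CRC-32 from Python's standard `zlib` module stands in for whatever hash the hardware is configured to use. The helper names `flow_key` and `flow_id` are hypothetical.

```python
import zlib

NUM_QUEUES = 64 * 1024  # 64K queues, matching the example in the text


def flow_key(src_ip, dst_ip, proto, src_port=0, dst_port=0):
    """Build a flow key; ports stay at 0 for an L3-only classification."""
    return f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()


def flow_id(key, num_queues=NUM_QUEUES):
    """Hash the key into the flow identifier range [0, num_queues)."""
    return zlib.crc32(key) % num_queues


# Two packets of the same L4 flow always map to the same queue.
a = flow_id(flow_key("10.0.0.1", "10.0.0.2", 6, 1234, 80))
b = flow_id(flow_key("10.0.0.1", "10.0.0.2", 6, 1234, 80))
print(a == b, 0 <= a < NUM_QUEUES)  # True True
```

The same identifier could also be reduced modulo the number of cores to pick an ingress CPU directly, which corresponds to the alternative of steering by flow key without the POE.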
In a system where packets from the same flow are always sent to the same CPU core for processing, the proper ordering of packets within the same transport layer flow is guaranteed. However, it is difficult to guarantee correct ordering of packets across multiple transport layer flows, because these packets may be forwarded to different CPU cores for processing and thus may experience varying amounts of delay, which can cause the packets to be transmitted out of the system out of order.
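A small simulation makes the cross-flow ordering problem concrete. It is a sketch with invented numbers: two hypothetical flows, X and Y, are each pinned to their own core, and the per-packet delays (5 time units on core 0, 1 on core 1) are assumptions chosen only to expose the effect. The function name `simulate` is likewise illustrative.

```python
def simulate(arrivals, core_of_flow, delay_of_core):
    """Return packets in the order they finish processing.

    arrivals: list of (arrival_time, flow, seq_in_flow), sorted by time.
    Each core handles one packet at a time, in arrival order.
    """
    core_free = {}   # core -> time at which it next becomes idle
    finished = []
    for t, flow, seq in arrivals:
        core = core_of_flow[flow]
        start = max(t, core_free.get(core, 0))
        done = start + delay_of_core[core]
        core_free[core] = done
        finished.append((done, flow, seq))
    finished.sort(key=lambda item: item[0])
    return [(flow, seq) for _, flow, seq in finished]


# Arrival order: X0, Y0, X1, Y1.
arrivals = [(0, "X", 0), (1, "Y", 0), (2, "X", 1), (3, "Y", 1)]
egress = simulate(arrivals, {"X": 0, "Y": 1}, {0: 5, 1: 1})
print(egress)  # [('Y', 0), ('Y', 1), ('X', 0), ('X', 1)]
```

Within each flow the sequence numbers still leave in order, but both Y packets overtake the X packets that arrived before them, so the egress order across flows no longer matches the arrival order.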