In conventional networking arrangements, pull or push techniques are employed to fetch, from host memory, packets that are to be transmitted. For example, in a conventional pull technique, depending upon the particular host operating system that is used, a host central processing unit (CPU) either writes the packet and its descriptor into host kernel memory, or passes the application buffer (containing the packet) to the host kernel and thereafter writes the packet's header and descriptor into the kernel memory. The CPU then writes a doorbell to alert the host's network interface controller (NIC). In response to the doorbell, the NIC reads the descriptor and packet from host memory. The NIC then schedules and processes the packet for transmission, and thereafter transmits the packet from the host.
Unfortunately, the above operations involved in carrying out this conventional pull technique introduce significant latencies in obtaining, processing, and transmitting the packet from the NIC. As can be readily appreciated, these latencies are undesirable, especially in the case of latency-intolerant and/or critical traffic.
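By way of illustration only, the sequence of operations in the conventional pull technique described above can be sketched as a small software model. The names used here (Descriptor, HostMemory, PullNic, cpu_send, host_reads) are hypothetical and are not drawn from any particular operating system or NIC; the model counts the host-memory reads that the NIC performs per packet, which is one source of the latency noted above.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """Hypothetical descriptor: locates one packet in host memory."""
    buffer_index: int
    length: int


@dataclass
class HostMemory:
    """Hypothetical model of host kernel memory."""
    packets: list = field(default_factory=list)
    descriptors: deque = field(default_factory=deque)


class PullNic:
    """Hypothetical NIC model that fetches work only after a doorbell."""

    def __init__(self, host_memory):
        self.host_memory = host_memory
        self.transmitted = []
        self.host_reads = 0  # reads the NIC issues to host memory

    def ring_doorbell(self):
        # In response to the doorbell, read each descriptor and then the
        # packet it points to, and only then transmit the packet.
        while self.host_memory.descriptors:
            desc = self.host_memory.descriptors.popleft()
            self.host_reads += 1  # descriptor read
            packet = self.host_memory.packets[desc.buffer_index][:desc.length]
            self.host_reads += 1  # packet read
            self.transmitted.append(packet)


def cpu_send(host_memory, nic, payload):
    """CPU side: write packet and descriptor, then write the doorbell."""
    host_memory.packets.append(payload)
    index = len(host_memory.packets) - 1
    host_memory.descriptors.append(Descriptor(index, len(payload)))
    nic.ring_doorbell()
```

In this simplified model, each transmitted packet costs one doorbell write plus two reads of host memory before transmission can begin; an actual NIC may batch or pipeline these operations.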
In a conventional push technique, the CPU provides an implicit doorbell by copying the packet and/or descriptor directly into the NIC memory. In response, the NIC schedules and processes the packet for transmission, and thereafter, transmits the packet from the host.
Unfortunately, the above operations involved in carrying out this conventional push technique do not scale well with increased packet traffic and/or increased numbers of CPU threads requesting packet transmissions. This is, at least in part, because this conventional push technique utilizes significant amounts of NIC (e.g., on-die) memory for each such packet transmission/transaction. Additionally, depending upon the host bus protocol/internal transport architecture employed, intensive use of posted writes over the bus/internal transport mechanism (e.g., to the NIC memory) may be involved for each packet transmission/transaction. This may degrade host CPU performance, particularly when CPU instruction re-ordering occurs in a way that makes it difficult to match corresponding completed posted writes.
Additional disadvantages of this conventional push technique can become apparent when the transmit traffic is bursty, thereby generating significant amounts of such traffic in relatively short time intervals. For example, such bursty traffic can completely fill the NIC memory that is devoted to packet pushing transactions, thereby stalling CPU threads from being able to send additional packets to the NIC for transmission, until sufficient NIC memory space has been freed (e.g., after other packets currently stored in the NIC's memory have been transmitted from the host). This can result in significant performance degradation. This condition can have particularly pernicious effects on latency-intolerant and/or critical traffic. Additional NIC memory can be provided to try to ameliorate this problem. However, adding NIC memory increases NIC cost.
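By way of illustration only, the stalling behavior of the conventional push technique under bursty traffic can be sketched with a small software model. The names used here (PushNic, capacity_bytes, send_burst) and the byte sizes in the usage note are hypothetical; the model assumes, for simplicity, that NIC memory is freed one packet at a time as packets are transmitted.

```python
from collections import deque


class PushNic:
    """Hypothetical NIC model with a fixed amount of on-die packet memory."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.pending = deque()
        self.transmitted = []

    def push(self, packet):
        # The copy into NIC memory acts as the implicit doorbell. It is
        # refused when the packet does not fit in the remaining space.
        if self.used_bytes + len(packet) > self.capacity_bytes:
            return False
        self.pending.append(packet)
        self.used_bytes += len(packet)
        return True

    def transmit_one(self):
        # Transmit the oldest stored packet and free its NIC memory.
        if not self.pending:
            return None
        packet = self.pending.popleft()
        self.used_bytes -= len(packet)
        self.transmitted.append(packet)
        return packet


def send_burst(nic, packets):
    """Push a burst; return how many times the sender had to stall."""
    stalls = 0
    for packet in packets:
        while not nic.push(packet):
            stalls += 1  # sender waits until NIC memory is freed
            if nic.transmit_one() is None:
                raise ValueError("packet larger than NIC memory")
    return stalls
```

For example, with a hypothetical 256-byte NIC memory and a burst of four 100-byte packets, the first two pushes succeed immediately, while the third and fourth each stall until an earlier packet has been transmitted. Increasing capacity_bytes removes the stalls in this model, which corresponds to the added NIC memory, and added cost, noted above.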
Although the following Detailed Description will proceed with reference being made to illustrative embodiments, many alternatives, modifications, and variations thereof will be apparent to those skilled in the art. Accordingly, it is intended that the claimed subject matter be viewed broadly.