Performance requirements for packet networks, such as the Internet and the latest generations of mobile phone networks, are rapidly increasing. At the same time, security and other requirements call for more operations on data packets and connections. These operations are performed by firewalls, routers, intrusion prevention systems and other network appliances. The performance of these appliances is often a bottleneck in the overall performance of a packet network.
Network appliances are either specialized or general-purpose computers running appropriate software. Usually, data packets are received by one network interface, transferred to the main memory, processed by the central processing unit using purpose-built software and transferred to another network interface for further transmission. The performance of the appliance is significantly affected by the efficiency of data transfer between the network interfaces and the main memory, and onward to the operating system's network stack.
A traditional way to transfer packet data from a network interface to main memory is via an interrupt handler. When a packet is available, an interrupt is raised by the Network Interface Card, NIC, hardware. The operating system then reads the packet from the card buffer. When many packets must be processed, this method incurs considerable overhead. Every time a packet is received, control is transferred from whatever the operating system kernel was doing to the device driver managing the NIC and, after the transfer operation, back again. Consequently, many optimization methods have been used and proposed to increase the data packet rate between the network interfaces and the main memory.
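The per-packet cost described above can be illustrated with a rough model. The following is only a sketch under stated assumptions: one interrupt per received packet, two control transfers per interrupt (into the driver and back), and a fixed cost per transfer. The function name `cpu_share_spent_switching` and the figures in the usage lines are hypothetical and chosen for illustration, not measured.

```python
def cpu_share_spent_switching(packets_per_second: float,
                              switch_cost_seconds: float) -> float:
    """Estimate the fraction of one CPU spent on control transfers.

    Assumes one interrupt per packet and two control transfers per
    interrupt: kernel -> NIC driver, then NIC driver -> kernel.
    """
    transfers_per_second = 2 * packets_per_second
    # The share cannot exceed one full CPU in this simple model.
    return min(1.0, transfers_per_second * switch_cost_seconds)


# Hypothetical figures: one million packets per second and
# 100 nanoseconds per control transfer.
share = cpu_share_spent_switching(1_000_000, 100e-9)
print(f"approximate CPU share spent switching: {share:.0%}")
```

Because the overhead in this model grows linearly with the packet rate, it shows why reducing the number of interrupts, or the amount of work done per interrupt, is a common optimization target.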
In Large Receive Offload, LRO, and Generic Receive Offload, GRO, the NIC driver assembles several received data packets belonging to the same Transmission Control Protocol, TCP, stream into a single, larger data packet before passing the assembled data packet to the operating system's network stack.
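The assembly step can be sketched as follows. This is a simplified illustration, not the implementation of any particular driver: it assumes a stream is identified by its source and destination addresses and ports, merges only segments whose sequence numbers are contiguous, and ignores TCP flags, options and checksums. The names `Segment` and `coalesce` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (source address, source port, destination address, destination port)
FlowKey = Tuple[str, int, str, int]


@dataclass
class Segment:
    flow: FlowKey
    seq: int        # sequence number of the first payload byte
    payload: bytes


def coalesce(segments: List[Segment], max_size: int = 65535) -> List[Segment]:
    """Merge in-order segments of the same stream into larger segments."""
    merged: List[Segment] = []
    last_for_flow: Dict[FlowKey, Segment] = {}  # most recent merged segment per stream
    for seg in segments:
        cur = last_for_flow.get(seg.flow)
        contiguous = cur is not None and cur.seq + len(cur.payload) == seg.seq
        if contiguous and len(cur.payload) + len(seg.payload) <= max_size:
            # Next bytes of the same stream: append to the open segment.
            cur.payload += seg.payload
        else:
            # New stream, a gap, or the size limit reached: start a new segment.
            new = Segment(seg.flow, seg.seq, seg.payload)
            merged.append(new)
            last_for_flow[seg.flow] = new
    return merged


flow_a = ("10.0.0.1", 1234, "10.0.0.2", 80)
flow_b = ("10.0.0.3", 5555, "10.0.0.2", 80)
batch = [
    Segment(flow_a, 0, b"abc"),
    Segment(flow_b, 100, b"xy"),
    Segment(flow_a, 3, b"def"),
]
for seg in coalesce(batch):
    print(seg.flow, seg.seq, seg.payload)
```

In this sketch the three received segments reach the network stack as two: the two contiguous segments of the first stream are passed up as one larger segment, so the stack performs its per-packet processing once instead of twice.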
However, for network appliances that operate as intermediary network nodes between the hosts at the endpoints of the communication, combining the packets and then recreating them for onward transmission is inefficient.