The complexity and sophistication of operating systems, application software, networking, networked communications, and the like continue to increase at dramatic rates. One result of the complexity and sophistication is increased functionality of applications and systems. This increased functionality often results in an increase in CPU overhead due to the additional duties that must be performed by the CPU to execute the increased system and application functions.
One area where the increase in CPU overhead is readily apparent is networked applications, where network speeds are increasing due to the growth in high bandwidth media. Network speeds often match, and increasingly exceed, the processor speed and memory bandwidth capabilities of the host computers. These networked applications further burden the host processor due to the layered architecture used by most operating systems, such as the seven-layer ISO Open Systems Interconnection (OSI) model or the layered model used by the Windows operating system. As is well known, such a model is used to describe the flow of data between the physical connection to the network and the end-user application. The most basic functions, such as putting data bits onto the network cable, are performed at the bottom layers, while functions attending to the details of applications are at the top layers. Essentially, the purpose of each layer is to provide services to the next higher layer, shielding the higher layer from the details of how services are actually implemented. The layers are abstracted in such a way that each layer believes it is communicating with the same layer on the other computer.
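The layering described above can be illustrated with a minimal sketch. The layer names, the text-style headers, and the functions send_down and receive_up are all hypothetical and chosen only for illustration; real protocol headers are binary and far richer. The sketch shows only that each layer adds its own header on the way down and removes it on the way up, so that each layer sees just what its peer layer produced.

```python
# Hypothetical layer names, listed from the top of the stack to the bottom.
LAYERS = ["transport", "network", "link"]


def send_down(payload: bytes, layers=LAYERS) -> bytes:
    """Wrap the payload with one header per layer, top layer first."""
    for name in layers:
        payload = f"<{name}>".encode() + payload
    return payload


def receive_up(frame: bytes, layers=LAYERS) -> bytes:
    """Strip the headers in the opposite order, bottom layer first."""
    for name in reversed(layers):
        header = f"<{name}>".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {name} header")
        frame = frame[len(header):]
    return frame


frame = send_down(b"hello")
print(frame)              # outermost header belongs to the lowest layer
print(receive_up(frame))  # the application payload is recovered
```

Every packet passes through each of these steps in both directions, which is one reason a layered stack adds per-packet processing cost on the host.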
Various functions that are performed on a data packet as it proceeds between layers can be software intensive, and often require a substantial amount of processor and memory resources. For instance, certain functions that are performed on the packet at various layers are extremely CPU intensive, such as packet checksum calculation and verification, encryption and decryption of data (e.g., SSL encryption and IP Security encryption), message digest calculation, TCP segmentation, TCP retransmission and acknowledgment (ACK) processing, packet filtering to guard against denial of service attacks, and User Datagram Protocol (UDP) packet fragmentation. As each of these functions is performed, the resulting demands on the CPU can greatly affect the throughput and performance of the overall computer system.
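As one illustration of why such functions are CPU intensive, the following is a minimal sketch of the 16-bit ones'-complement checksum used by IP, TCP, and UDP, in the style described in RFC 1071. The function name internet_checksum is an assumption made for this example, and the sketch favors clarity over speed. Note that every byte of the data must be read and summed, so the cost grows with both packet size and packet rate.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Combine two bytes into one big-endian 16-bit word and accumulate.
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


data = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(data)
# A receiver verifies by checksumming the data together with the checksum;
# a result of zero means no error was detected.
verified = internet_checksum(data + checksum.to_bytes(2, "big")) == 0
```

When this work is performed in software for every packet sent and received, it competes directly with applications for processor time.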
Although the demand on CPU resources grows, the capability and throughput of computer hardware peripherals such as network interface cards (NICs) and the like are also increasing. These peripherals are often equipped with a dedicated processor and memory that are capable of performing many of the tasks and functions that are otherwise performed by the CPU.
The computer industry recognized this capability and developed methods to offload CPU intensive tasks and functions that were previously performed by the CPU. For example, the commonly assigned U.S. Pat. No. 6,141,705 to Anand et al., and patent application Ser. No. 09/657,510, “Method and Computer Program Product for Offloading Processing Tasks from Software to Hardware,” filed Sep. 7, 2000, and Ser. No. 09/726,082, “Method and Computer Program Product for Offloading Processing Tasks from Software to Hardware,” filed Nov. 29, 2000, provide solutions to query peripheral devices and offload specific processor tasks to the peripheral devices that are capable of performing the intensive tasks and functions. The specific tasks typically offloaded include TCP (Transmission Control Protocol) and/or IP (Internet Protocol) checksum computation, TCP segmentation such as large send offload (LSO), and Internet Protocol security (IPSEC) encryption and decryption.
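The general idea of querying a peripheral device and offloading only the tasks it supports can be sketched as follows. This is an illustration under stated assumptions, not the interface of the cited patents or of any operating system: the names Offload, HypotheticalNic, query_capabilities, and plan_offload are all invented for this example.

```python
from enum import Flag, auto


class Offload(Flag):
    """Hypothetical task flags, named after the tasks listed above."""
    NONE = 0
    CHECKSUM = auto()
    LSO = auto()
    IPSEC = auto()


class HypotheticalNic:
    """Stand-in for a peripheral device that reports what it can do."""

    def __init__(self, capabilities: Offload):
        self._capabilities = capabilities

    def query_capabilities(self) -> Offload:
        return self._capabilities


def plan_offload(nic: HypotheticalNic, wanted: Offload):
    """Split the wanted tasks into those the device takes and the rest."""
    supported = nic.query_capabilities()
    offloaded = wanted & supported      # tasks handed to the device
    in_software = wanted & ~supported   # tasks the host CPU still performs
    return offloaded, in_software


nic = HypotheticalNic(Offload.CHECKSUM | Offload.LSO)
offloaded, in_software = plan_offload(nic, Offload.CHECKSUM | Offload.IPSEC)
```

In this sketch the device accepts the checksum task, while IPSEC processing stays on the host because the device did not report that capability.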
These offload mechanisms are limited in that the mechanisms have a secondary requirement that a minimum number of changes be made to the network stack. As a result of this secondary requirement, another limitation is that the offloads have a long code path because the entire network stack is traversed with the offloaded tasks and functions disabled to reach the peripheral device. A further limitation is the lack of integration with the network stack. There is no well-defined interface for the network stack to query or set parameters on the peripheral device, nor is there an interface for the peripheral device to inform the network stack of any notifications or changes of capabilities. For example, if the route changes when an LSO request is being processed, the fallback mechanism is for the stack to wait for timeouts and retransmit the LSO request.
Another approach that peripheral device manufacturers have tried is to offload the entire TCP connection from the core stack to a network interface card (NIC). This approach bypasses the entire protocol stack by using a proprietary interface and requires the NIC to handle all TCP messages, IP (Internet Protocol) messages, ICMP (Internet Control Message Protocol) messages, DNS (Domain Name System) messages, and RIP (Routing Information Protocol) messages. Additionally, this approach does not address multi-homed environments and does not cleanly integrate with the host operating system network management utilities. Once a state changes, the offloaded connection can easily fail.