1. Field of the Invention
The present invention relates generally to an improved data processing system, and in particular to a computer implemented method and data processing system for the detection and recovery of a network interface card (NIC) transmission control protocol (TCP) acceleration offload failure.
2. Description of the Related Art
Data communications have grown exponentially in recent years because of enhanced communications standards and network availability. Transmission control protocol (TCP) is a standard that ensures that packets of data are delivered and received in the same order they were sent. Internet Protocol (IP) is used in conjunction with TCP to designate how information travels between systems across the Internet. Most computers have a network interface card that uses the TCP/IP protocols to send and receive information through a network.
In a traditional network interface card, the TCP stack uses the system processor to break a TCP segment down into Ethernet frames before passing the data to the network interface card. This requires a large amount of processing time, especially in a Gigabit network where a network interface card can generate tens of thousands of interrupts per second. These interrupts utilize even more processor cycles.
While local area network (LAN) technology, Ethernet in particular, has improved the media speed tenfold every 3 to 4 years, the central processing unit (CPU) speed doubles every other year. Consequently, the CPUs are becoming the bottleneck at a rapid rate in high input/output (I/O) performance systems. To alleviate this lag in processor performance, an increasing number of native host functions can be offloaded to I/O adapters to accelerate data throughput. Throughput is a measure of the amount of data transferred in a specific amount of time. Offloading functions reduces the host CPU workload and has the added benefit of improving the I/O adapter throughput.
One TCP acceleration offload function is TCP segmentation offload (TSO). In TCP segmentation offload, also known as “large send offload” (LSO), the host TCP protocol stack creates a large TCP segment, up to 64 KB in size. This large segment is then passed to the IP protocol stack, where the segment is encapsulated in a single IP packet. The encapsulated segment is then passed to the network interface card device driver and finally to the network interface card for transmission. A network interface card that implements TCP segmentation offload then resegments this single large TCP segment into multiple smaller TCP segments, which are typically 1460 bytes for a standard Ethernet connection, and inserts the necessary Ethernet/IP/TCP header information for each segment. The performance benefit of segmentation offloading comes from the fact that the host TCP stack can build larger packets, which typically translates into reduced host processor utilization. An additional performance benefit is that, in general, larger PCI data transactions translate into higher PCI bus throughput. Since the work of segmenting the buffer into Ethernet frames is done by the network interface card, the processor is available to perform other tasks.
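The resegmentation step described above can be illustrated with a minimal sketch. This is not the adapter's actual logic; the function name `resegment`, the constant `MSS`, and the representation of a segment as a (sequence number, payload) pair are assumptions made for illustration only, and header construction is omitted.

```python
from typing import List, Tuple

# Assumed value: typical TCP payload per segment on standard Ethernet.
MSS = 1460


def resegment(payload: bytes, start_seq: int, mss: int = MSS) -> List[Tuple[int, bytes]]:
    """Split one large TCP payload into (sequence_number, chunk) pairs.

    Hypothetical helper: a real adapter would also build the
    Ethernet/IP/TCP headers for every resulting segment.
    """
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        # TCP sequence numbers are 32-bit and wrap around.
        seq = (start_seq + offset) % 2**32
        segments.append((seq, chunk))
    return segments


if __name__ == "__main__":
    large_send = bytes(64000)  # one large segment handed down by the host stack
    parts = resegment(large_send, start_seq=1000)
    print(len(parts), len(parts[-1][1]))
```

Under these assumptions, a 64000-byte send becomes 44 segments: 43 full 1460-byte segments and one final 1220-byte segment.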
Another TCP acceleration offload function is TCP checksum offload (TCO). In TCP checksum offload, a network interface card that implements the function calculates the TCP checksum instead of the host CPU. TCP checksum offload can significantly reduce host CPU workload because the task of performing a checksum of the TCP payload, TCP header, and IP header is offloaded to the network interface card. The host protocol layer may optionally calculate a TCP pseudo header checksum (depending on the specific requirements of the network interface card) and place the value in the checksum field. The network interface card may then calculate the correct TCP checksum without having to reference the IP header.
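The checksum calculation can be sketched as follows, assuming the standard 16-bit ones' complement Internet checksum over an IPv4 pseudo header (source address, destination address, protocol number, TCP length) followed by the TCP header and payload. The function names are hypothetical and the split between host and adapter is only indicated in comments; an actual adapter performs this work in hardware or microcode.

```python
import socket
import struct


def ones_complement_sum(data: bytes, initial: int = 0) -> int:
    """16-bit ones' complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header_sum(src_ip: str, dst_ip: str, tcp_length: int) -> int:
    """Partial sum that the host may precompute for the adapter."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_length))
    return ones_complement_sum(pseudo)


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a TCP segment whose checksum field is zero."""
    partial = pseudo_header_sum(src_ip, dst_ip, len(segment))
    return ~ones_complement_sum(segment, partial) & 0xFFFF


def is_valid(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver check: a correctly filled segment sums to zero."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

A receiver applying a check like `is_valid` drops any segment whose checksum field does not match, which is why a faulty checksum generator on the sending adapter causes every offloaded packet to be discarded at the destination.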
When the TCP acceleration offload functions operate as intended, network and system performance may be significantly enhanced. Thus, many operating systems take advantage of these acceleration features, including AIX® (Advanced Interactive eXecutive), a product of IBM® Corporation.
However, a significant limitation of current TCP acceleration offload functions is that severe problems can result when these functions fail to operate correctly. For example, the TCP checksum generator logic in a network interface card may transition to a “bad” state due to a failure in the hardware state machine or microcode that implements this logic. In this situation, every TCP checksum offload packet sent by the adapter would have an invalid TCP checksum, and the destination host would discard these packets. Thus, when acceleration offload functions fail to operate as intended, severe network degradation can occur, often to the point where the network appears to be practically unusable.