Conventional TCP processing is exemplified by systems and methods developed to accelerate data transfer between a client and a server. Software implementations executed on a host processor, e.g., a Central Processing Unit (CPU), are inexpensive but slow compared with costly dedicated hardware implementations designed to offload TCP processing from the host processor.
FIG. 1 is a block diagram of an exemplary embodiment of a prior art computing system, generally designated 100, including a CPU 110 and a Network Interface Card (NIC) 150. Computing System 100 may be a desktop computer, server, laptop computer, palm-sized computer, tablet computer, game console, cellular telephone, computer-based simulator, or the like. A Bus 112 coupling CPU 110 to a System Controller 120 may be a front side bus (FSB). Computing System 100 may be a hub-based architecture, also known as an INTEL® hub architecture, where System Controller 120 is a memory controller hub and an I/O Bridge 140 is coupled to System Controller 120 via a Hub-to-hub Interface 126. System Controller 120 is coupled to System Memory 130 via a Memory Bus 132. I/O Bridge 140 includes a controller for Peripheral Component Interconnect (PCI) Bus 182 and may include controllers for a System Management Bus 142, a Universal Serial Bus 144, and the like. I/O Bridge 140 may be a single integrated circuit or single semiconductor platform. Examples of System Controller 120 known in the art include the INTEL® Northbridge. Examples of I/O Bridge 140 known in the art include the INTEL® Southbridge and an NVIDIA® Media and Communications Processor (MCP) chip.
NIC 150 may share PCI Bus 182 with one or more PCI Devices 180. NIC 150 includes a PCI Interface 175, a Dedicated Processor 155, a Medium Access Controller (MAC) 165, a Dedicated Memory 160, and an ETHERNET Interface 170 to interface to an ETHERNET Network 172. A software Driver 119 for NIC 150 communicates between NIC 150 and an Application Program 117; Driver 119 and Application Program 117 both execute on CPU 110. An Application Memory Space 125, a TCP Stack Memory Space 145, and a Driver Memory Space 135 are allocated within System Memory 130.
Conventionally, NIC 150 uploads processed TCP frame data to Driver Memory Space 135 and generates an interrupt to inform TCP Stack 115 that the processed TCP frame data is available. Application Program 117 then copies the processed TCP frame data from Driver Memory Space 135 to Application Memory Space 125. During the copy, System Memory 130 may be unavailable to NIC 150 for either uploading or downloading, possibly impacting transmit or receive performance of NIC 150. Furthermore, copying the processed TCP frame data requires several clock cycles, adding latency between the processing of the TCP frame data by NIC 150 and receipt of the processed TCP frame data by TCP Stack 115.
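The conventional receive path described above can be illustrated with a minimal Python sketch. It is a simplified software model for illustration only, not an implementation of NIC 150 or Driver 119: the class and function names (SystemMemory, Nic, copy_to_application), the buffer size, and the use of a callback in place of a hardware interrupt are all assumptions introduced here. The sketch shows only that the same payload is written into system memory twice, once by the upload and once by the subsequent copy.

```python
# Simplified model of the conventional receive path.
# All names and sizes are hypothetical and chosen for illustration.

class SystemMemory:
    """Stands in for System Memory 130 with two of its allocated regions."""

    def __init__(self, size=2048):
        self.driver_space = bytearray(size)       # Driver Memory Space 135
        self.application_space = bytearray(size)  # Application Memory Space 125


class Nic:
    """Stands in for NIC 150: uploads processed frame data, then signals."""

    def __init__(self, memory, on_interrupt):
        self.memory = memory
        self.on_interrupt = on_interrupt  # callback standing in for the interrupt

    def upload(self, payload):
        n = len(payload)
        if n > len(self.memory.driver_space):
            raise ValueError("payload larger than driver memory space")
        self.memory.driver_space[:n] = payload  # first write into system memory
        self.on_interrupt(n)                    # announce that data is available


def copy_to_application(memory, length):
    """The additional copy within system memory; returns the bytes moved."""
    memory.application_space[:length] = memory.driver_space[:length]
    return length


memory = SystemMemory()
copied = []  # records the length of each copy triggered by an interrupt
nic = Nic(memory, lambda n: copied.append(copy_to_application(memory, n)))
nic.upload(b"processed TCP frame data")
print(bytes(memory.application_space[:copied[0]]))
```

In this model every received payload occupies both memory regions after the interrupt is handled, which corresponds to the extra memory traffic and latency attributed to the copy in the description above.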
Therefore, there is a need for a partial hardware implementation that optimizes TCP processing by offloading some tasks from a host processor and reduces the need to copy processed TCP frame data within system memory.