A data receiver must receive and process incoming packets quickly enough that no significant buildup, or bottleneck, occurs at any stage of the processing. Such bottlenecks have occurred because of, for example, the relatively large number of operations that a central processing unit (CPU) is required to perform. As the speed of CPUs has increased, more than compensating for these bottlenecks, the bottlenecks have moved to other places in the processing chain.
The Transmission Control Protocol (TCP) is a connection-based packet protocol between two endpoints. Each endpoint needs to perform a set of operations, termed TCP termination, on receiving TCP packets in order to support the protocol. Until relatively recently, TCP termination operations have typically been performed in software, under direction of a CPU. As data transfer rates have increased, such software-driven terminations have become bottlenecks, and have been transferred to hardware, typically in the form of a printed circuit card or an application specific integrated circuit (ASIC). Hardware for performing the terminations is termed a TCP off-load engine (ToE).
Terminating hardware such as a ToE is typically coupled to an Ethernet network. The hardware strips the headers from incoming packets and transfers the payload of the packets to a host system. The payload is stored in a first memory, the data memory, until the host system accepts it, or until missing packets have been received by the ToE, so that the ToE can send the data to the host in the original transmitted order. The size of the data memory needed is proportional to the product of the network rate and the network round trip delay (since all incoming data has to be acknowledged), leading to the need for large memories, of the order of hundreds of megabits. Such memories are not practical for current ASIC technologies. In addition to being large, memories for terminating hardware need fast access rates, since received data has to be written into, then read from, the memory at the network rate. If the memory is also used for temporarily storing transmitted data, the transmitted data also has to be written into, then read from, the memory. The memory thus needs an access rate of the order of four times the network rate.
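The sizing argument above may be illustrated by the following minimal sketch. The function names, the 10 Gb/s network rate, and the 20 ms round trip delay are assumptions chosen only for illustration and do not come from the description; the constant of proportionality for the memory size is taken as one.

```python
# Illustrative sketch only: names and numeric values are assumptions.

def data_memory_bits(network_rate_bps: int, round_trip_ms: int) -> int:
    """Data memory size taken as network rate times round trip delay
    (constant of proportionality assumed to be one)."""
    return network_rate_bps * round_trip_ms // 1000


def memory_access_rate_bps(network_rate_bps: int,
                           buffers_transmit: bool = True) -> int:
    """Access rate needed by the data memory.

    Received data is written once and read once (2 x network rate).
    If transmitted data is buffered in the same memory, it is also
    written once and read once (a further 2 x network rate).
    """
    accesses_per_bit = 2
    if buffers_transmit:
        accesses_per_bit += 2
    return accesses_per_bit * network_rate_bps


if __name__ == "__main__":
    rate = 10_000_000_000   # assumed 10 Gb/s network rate
    rtt_ms = 20             # assumed 20 ms round trip delay
    print(data_memory_bits(rate, rtt_ms))      # 200000000 bits (200 Mbit)
    print(memory_access_rate_bps(rate))        # 40000000000 bits/s
```

Under these assumed values the data memory is of the order of hundreds of megabits and its access rate is four times the network rate, consistent with the orders of magnitude stated above.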
The headers comprise Ethernet, Internet Protocol (IP), and TCP layers, as well as optional higher layers such as an Internet Small Computer System Interface (iSCSI) layer. A second, context, memory acts as a database of connections maintained by the host system, the database comprising parameters for the state of each connection. The context memory, for example, maintains the last sequence number of received TCP segments. Other layers, such as the iSCSI layer, require the context memory to maintain parameters relevant to these connections.
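By way of a hedged illustration, a context memory of the kind described may be modeled as a table keyed by connection. In the sketch below, the class names, the field names, the choice of the address and port 4-tuple as the key, and the reading of "last sequence number" as the sequence number just past the last received byte are all assumptions made for the example, not details given in the description.

```python
# Illustrative sketch only: all names and the keying scheme are assumptions.
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumed connection key: (source IP, source port, destination IP, destination port)
ConnKey = Tuple[str, int, str, int]


@dataclass
class ConnectionContext:
    # Sequence number just past the last received byte, modulo 2**32
    # (one possible reading of "last sequence number").
    last_rx_seq: int = 0
    # Parameters kept for optional higher layers such as iSCSI (hypothetical).
    upper_layer: Dict[str, int] = field(default_factory=dict)


class ContextMemory:
    """A database of per-connection state, counting every access."""

    def __init__(self) -> None:
        self._table: Dict[ConnKey, ConnectionContext] = {}
        self.accesses = 0

    def lookup(self, key: ConnKey) -> ConnectionContext:
        self.accesses += 1
        return self._table.setdefault(key, ConnectionContext())

    def on_segment(self, key: ConnKey, seq: int, payload_len: int) -> ConnectionContext:
        """Update the connection state for one received TCP segment."""
        ctx = self.lookup(key)
        ctx.last_rx_seq = (seq + payload_len) % 2**32
        return ctx


if __name__ == "__main__":
    memory = ContextMemory()
    key = ("10.0.0.1", 1234, "10.0.0.2", 3260)
    print(memory.on_segment(key, seq=1000, payload_len=500).last_rx_seq)
```

Each received segment causes at least one access to the table, which is why the access rate of the context memory follows the incoming segment rate.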
For a system having relatively few connections, the context memory may be implemented within an ASIC as an on-chip memory. When larger numbers of connections need to be supported, the context memory may require use of an external memory. When external memory is used, the access rate to the external memory becomes an important consideration. The access rate depends linearly on the incoming packet or segment rate, which is variable. For example, if a large number of short packets is received, the incoming packet rate, and hence the external memory access rate, is high. To implement such high access rates, very large numbers of data bus pins must be used, and such large numbers may be difficult to implement. Thus, an efficient hardware implementation of a ToE requires high access rates for both the context memory and the data memory, and such an implementation may be costly and may not even be practical for the high efficiencies required.
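The dependence of the context memory access rate on packet size may be illustrated by the following minimal sketch. The function names, the 10 Gb/s network rate, the packet sizes, and the figure of two context accesses per packet (one read and one write) are assumptions for illustration only; per-packet overhead on the wire, such as preamble and inter-frame gap, is ignored.

```python
# Illustrative sketch only: names and numeric values are assumptions.

def packets_per_second(network_rate_bps: int, packet_bytes: int) -> int:
    """Packet rate for back-to-back packets of a given size (wire overhead ignored)."""
    return network_rate_bps // (packet_bytes * 8)


def context_accesses_per_second(network_rate_bps: int,
                                packet_bytes: int,
                                accesses_per_packet: int = 2) -> int:
    """Context memory access rate, linear in the packet rate.

    Two accesses per packet (one read, one write) is an assumed figure.
    """
    return accesses_per_packet * packets_per_second(network_rate_bps, packet_bytes)


if __name__ == "__main__":
    rate = 10_000_000_000  # assumed 10 Gb/s network rate
    for size in (64, 1500):
        print(size, packets_per_second(rate, size),
              context_accesses_per_second(rate, size))
```

With these assumed values, 64-byte packets produce a packet rate more than twenty times that of 1500-byte packets at the same network rate, and the context memory access rate rises in the same proportion.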