The transmission control protocol/internet protocol (TCP/IP) suite has become widely used for communications. However, receiving, buffering, processing and storing the data communicated in TCP segments can consume a substantial amount of host processing power and memory bandwidth at the receiver. In a typical system, reception includes processing in multiple communications layers before the data is finally copied to its final destination (e.g., an application buffer).
A conventional network interface card (NIC) processes layer 2 (L2) headers (e.g., Ethernet headers) and then copies at least the remaining headers (layer 3 (L3) and higher) and the upper layer protocol (ULP) payload to a transport buffer (e.g., a TCP buffer) for networking and transport layer processing. The transport and networking processing (e.g., TCP/IP processing in which TCP is the transport layer protocol), typically performed by the central processing unit (CPU), removes the L3 and L4 headers and then copies, for example, any remaining headers and the ULP payload to another buffer. The process repeats for subsequent layers until the last header is removed and the ULP payload is copied to the buffer assigned by an application.
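The layered receive path described above can be sketched as follows. This is a minimal illustration only; the header lengths and the `receive_frame` function are hypothetical and do not correspond to any particular stack's interface:

```python
# Illustrative sketch of the conventional layered receive path: each layer
# strips its header and copies the remainder onward, so the ULP payload is
# copied once per layer before reaching the application buffer.
# Header lengths are assumptions (untagged Ethernet, minimal IPv4/TCP).

ETH_HDR = 14   # L2 (Ethernet) header length
IP_HDR = 20    # L3 (IPv4) header length, no options
TCP_HDR = 20   # L4 (TCP) header length, no options
ULP_HDR = 8    # hypothetical ULP header length

def receive_frame(frame: bytes) -> bytes:
    """Return the ULP payload after the conventional layer-by-layer copies."""
    l3_buffer = frame[ETH_HDR:]                 # NIC strips L2; copy to transport buffer
    ulp_buffer = l3_buffer[IP_HDR + TCP_HDR:]   # CPU strips L3/L4; copy to next buffer
    app_buffer = ulp_buffer[ULP_HDR:]           # ULP strips its header; copy to application
    return app_buffer
```

Each slice models one copy of the payload bytes; eliminating all but the final copy to the application buffer is precisely the optimization that conventional systems fail to achieve.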
Typically, most of the bytes of the frames are payload (e.g., data), but it is difficult to know the boundary between the various headers and the payload. For the above-identified and other reasons, the payload bytes are copied repeatedly as the control portion of the frames (e.g., the headers) is processed in a layered fashion. The host CPU incurs a substantial overhead for this processing and copying including, for example, handling many interrupts and context switching. Thus, fewer cycles are available for application processing, which is the desired use of a computer (e.g., a server machine). For high-speed networking (e.g., 10 Gigabits per second), the additional copying strains the memory subsystem of the computer. For an average of three data copies, the memory subsystem of most commercially available server computers becomes a bottleneck, thereby preventing the system from supporting, for example, 10 Gigabit per second network traffic. In some cases, the host copies can consume more than three times the wire bandwidth (e.g., 30 Gigabits per second of memory traffic for 10 Gigabit per second network traffic). Since TCP/IP is the dominant transport protocol used by most applications today, it would therefore be useful to ease the burden of this processing to achieve, for example, scalable low CPU utilization when communicating with a peer machine.
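The bandwidth figures above can be checked with simple arithmetic. The sketch below assumes the average of three host data copies stated in the text and ignores read/write amplification within the memory subsystem:

```python
# Each host copy of the payload moves the full wire-rate data through memory
# again, so memory traffic scales with the number of copies (an assumption
# consistent with the figures in the text).
wire_rate_gbps = 10                                     # 10 Gigabit per second link
avg_host_copies = 3                                     # average number of payload copies
memory_traffic_gbps = wire_rate_gbps * avg_host_copies
print(memory_traffic_gbps)                              # 30
```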
Conventional systems may be unable to reduce this overhead by, for example, copying data only once from the wire to the application buffer. Typically, the NIC cannot distinguish which portion of a received frame contains ULP data and which portion contains ULP control. Conventional senders may not build frames in a way that enables the receiver NIC to make such distinctions. In addition, a typical TCP sender has no mechanism that allows it to segment the byte stream based on protocol data unit (PDU) boundaries. Conventional systems may not be able to handle complexities such as, for example, every ULP having its own method for mixing data and control, thereby making it impractical to build a NIC that can distinguish control from data for all of the ULPs or even for a substantial subset of all of the ULPs.
Conventional systems may not be able to directly move data from the TCP byte stream service to the ULP. It is not possible to tell where a ULP message (e.g., a protocol data unit (PDU)) begins inside the endless stream of bytes. Assuming that the frames arrive without resegmentation at the receiver (e.g., a server), it is possible that the receiver might be able to unpack the frame using TCP and might be able to locate the ULP header. The ULP header may include, for example, control information that may identify a location in the application buffer where the ULPDU may be directly placed. However, resegmentation is not uncommon in TCP/IP communications. There is no guarantee that the TCP segments will arrive, on the other end of the wire, the way the sender has built them. At present, there is no mechanism that allows the receiver to determine whether a byte stream (e.g., TCP segments) has been subject to resegmentation or has been received as originally segmented by the sender. Because a conventional system typically cannot detect such resegmentation, it cannot rely on the sender to build a TCP segment in a structure known to the receiver. Therefore, a conventional receiver cannot rely on sender-segmented messages to locate the ULP header and the ULPDU in a resegmented TCP byte stream.
For example, there may be network architectural structures between the sender and the receiver. An intermediate box or middle box (e.g., a firewall) may terminate the TCP connection with the sender and, without the sender or the receiver being aware, may initiate another TCP connection with the receiver. The intermediate box may resegment the incoming frames (e.g., by using a smaller TCP payload). Thus, a single frame may enter the intermediate box, but a plurality of smaller frames, each with its own TCP header, may exit the intermediate box, or vice versa. This behavior by the middle box may disrupt even carefully placed control and data portions.
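The middle-box resegmentation described above can be illustrated with a short sketch. The `segment` helper, the 8-byte ULP header, and the segment sizes are hypothetical choices for illustration only:

```python
# Hypothetical illustration of middle-box resegmentation: the sender aligns
# one ULP message (PDU) per TCP segment, but a middle box re-splits the byte
# stream with a smaller payload size, so most resulting segments begin with
# payload bytes rather than a ULP header.

def segment(stream: bytes, mss: int) -> list:
    """Split a byte stream into TCP payloads of at most mss bytes."""
    return [stream[i:i + mss] for i in range(0, len(stream), mss)]

ULP_HEADER = b"HDRHDRHD"                 # hypothetical 8-byte ULP header
pdu = ULP_HEADER + b"x" * 100            # one 108-byte PDU, header first
sender_segments = [pdu]                  # sender: one PDU per TCP segment

# The middle box terminates the connection and re-segments with a smaller MSS.
middlebox_segments = segment(b"".join(sender_segments), 40)

assert sender_segments[0].startswith(ULP_HEADER)         # header at a known offset
assert not middlebox_segments[1].startswith(ULP_HEADER)  # header position lost
```

After resegmentation, only the first of the three resulting segments carries the ULP header at its start; the receiver can no longer assume a fixed header position within each segment.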
In the case of resegmentation, the conventional receiver may face a number of challenges. For example, the receiver may not be aware that there are any intermediate boxes between the sender and the receiver. The TCP sender may be ULP agnostic and may provide no guarantees as to the mapping of ULP messages into TCP segments or as to the segmentation of TCP segments based on ULP message boundaries. Furthermore, the initial segmenting scheme used by the sender may not be the segmenting scheme seen by the receiver. Thus, although the receiver may be able to place the smaller frames in order, the receiver may be unable to locate the ULP header and the payload without processing the ULP headers. Accordingly, the receiver may not be able to ascertain the control and payload boundary that may be necessary to correctly place the ULPDU payload in the proper location of, for example, the application buffer of the receiver.
Conventional systems may face additional challenges when receiving from TCP/IP networks prone to forwarding segments to the receiver out of order. This is more evident when the ULP has a PDU larger than a TCP segment, which may be limited, for example, to 1460 bytes when used on top of Ethernet. Thus, the ULPDU may be split among a plurality of TCP segments. Therefore, some TCP segments may contain only data and no control information that instructs the receiving NIC as to where to place the data. Conventional systems may also include receivers that do not know, in advance, the location of the control information inside the received segment. The only way to find the location of the control information is to process the ULPDUs in a sequential fashion, locating each ULPDU header and processing it according to a specific protocol. The receiver is faced with a choice of either dropping the out-of-order segments and requesting retransmission, which is costly in terms of delay and performance loss, or buffering the out-of-order segments until all the missing segments have been received. Some conventional implementations may choose to accumulate the out-of-order segments, to wait for the missing TCP segments to be received, and then to place them in order. Once the TCP segments have been ordered, the receiving NIC may commence with the processing of the whole set of TCP segments. The receiving NIC may then analyze the ULP control portions to obtain information relating to data placement. If ULP placement is not used, then an additional copy from a TCP temporary buffer to the ULP buffer may be necessary. The process suffers from additional costs including, for example, a temporary buffer, a higher-powered CPU and a wider data path. In the case of ULP placement, the process is protocol specific, which makes it more difficult to support various protocols.
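The accumulate-and-reorder behavior described above can be sketched as follows. The `OutOfOrderBuffer` class and its names are illustrative only, not a real NIC or stack interface:

```python
# Sketch of conventional out-of-order handling: early segments are held in a
# temporary buffer keyed by TCP sequence number, and bytes are copied onward
# only once a contiguous run starting at the next expected sequence number
# exists. ULP header processing cannot begin until the bytes are in order.

class OutOfOrderBuffer:
    def __init__(self, initial_seq: int):
        self.expected_seq = initial_seq
        self.pending = {}              # seq -> payload held out of order
        self.ordered = bytearray()     # contiguous bytes ready for ULP processing

    def receive(self, seq: int, payload: bytes) -> None:
        self.pending[seq] = payload
        # Drain every segment that is now contiguous with the ordered run.
        while self.expected_seq in self.pending:
            data = self.pending.pop(self.expected_seq)
            self.ordered += data       # the extra copy into the temporary buffer
            self.expected_seq += len(data)

buf = OutOfOrderBuffer(initial_seq=1000)
buf.receive(1005, b"world")            # early arrival: held, no processing possible
buf.receive(1000, b"hello")            # fills the gap: both segments drain in order
assert bytes(buf.ordered) == b"helloworld"
```

Draining the accumulated backlog while new segments keep arriving at wire speed is what forces the NIC to sustain an effectively higher data rate than the wire rate.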
The receiving NIC has to process all the accumulated out-of-order TCP segments concurrently with the reception of new TCP segments from the wire at wire speed, as traffic on the link may continue all the time. This further strains the memory interface and the NIC's header processing entity (e.g., an embedded CPU) and forces the NIC architecture to support an effectively higher data rate than the wire rate.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of ordinary skill in the art through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.