a. TCP and UDP Checksums
A number of different packet based protocols have been defined to enable interconnected systems to communicate with each other. For example, the Internet Protocol (IP) defines a networking layer protocol that allows packets to be routed, switched, or otherwise passed from network node to network node as they progress from a source point to a destination point within an IP network. At the transport layer, the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP) may be used to control the flow of packets from a source point to a destination point. TCP is a connection oriented protocol while UDP is a datagram oriented protocol. Either may be viewed as a transport layer protocol that can be configured to “run on top of” the IP protocol.
FIG. 1a examines this in more detail. FIG. 1a shows how a packet may be constructed (by moving downward along the “Transmitting Direction” of FIG. 1a) prior to its being transmitted into a network (e.g., an Ethernet network) and/or how a packet may be broken down (by moving upward along the “Receiving Direction” of FIG. 1a) after its reception from a network. The embodiment(s) illustrated in FIG. 1a therefore correspond to an IP packet being transported using the UDP or TCP protocol.
Referring to FIG. 1a, moving along the transmitting direction, application data 101a (e.g., the data that corresponds to the packet's “payload”) is encapsulated by a UDP header 102a or a TCP header 102b. For simplicity, the term “TCP/UDP” is utilized in FIG. 1a to express that either TCP or UDP may apply. FIG. 1b shows a detailed embodiment of a UDP header 102a that is prefixed (or prepended) to the application data 101a; and, FIG. 1c shows a detailed embodiment of a TCP header 102b that is prefixed to application data 101a. 
The UDP header embodiment 102a of FIG. 1b divides the header into four fields 110b, 103b, 104b, 106b. These fields identify the source port 104b of the packet, the destination port 103b of the packet, the UDP message length 106b, and a UDP checksum value 110b. A port is typically associated with each uniquely identifiable agent (e.g., a particular application, a particular user, a particular object, etc.) that uses an IP network. Ports are often viewed as the ultimate sources and/or destinations of the packets that traverse an IP network.
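The four-field layout described above can be illustrated with a minimal sketch. It assumes the standard UDP field order (source port, destination port, length, checksum, each 16 bits in network byte order); the names `UDP_HEADER`, `build_udp_header` and `parse_udp_header` are hypothetical and do not appear in the figures.

```python
import struct

# Assumed standard UDP layout: source port, destination port, length,
# checksum, each a 16-bit big-endian (network order) integer.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_header(src_port, dst_port, payload, checksum=0):
    """Return the 8-byte UDP header for the given payload."""
    length = UDP_HEADER.size + len(payload)  # header plus data, in bytes
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)

def parse_udp_header(packet):
    """Split the first 8 bytes of a UDP packet into its four fields."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(packet)
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}

header = build_udp_header(5000, 53, b"hello")
fields = parse_udp_header(header + b"hello")
```

Here the checksum field is left at zero; it would be filled in only after the checksum calculation over the whole block has been performed.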
A single IP node (e.g., a source IP node or a destination IP node) may be configured to support multiple ports. Thus, for example, a group of packets that are directed to a particular IP destination node may be ultimately delivered to different agents that each view the destination IP node as their access point to the IP network.
Commonly, multiple ports are associated with the same machine (e.g., a computing system) wherein the machine has a specific IP address. For example, a server may have a specific IP address and the application programs running on the server may each be identified with a different port. This allows the different application programs to establish different flow arrangements with the IP network (via the TCP or UDP headers) while sharing the same access point to the IP network (via the network resources of the server and its corresponding IP address).
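As a rough illustration of several agents sharing one IP address, the sketch below delivers incoming payloads to per-port queues. The class `PortDemux` and its methods are hypothetical names introduced only for this example; a real protocol stack is considerably more involved.

```python
class PortDemux:
    """Toy model of one IP node handing payloads to per-port agents."""

    def __init__(self):
        self._queues = {}  # destination port -> list of pending payloads

    def deliver(self, dst_port, payload):
        # Every packet arrives at the same IP address; the destination
        # port alone selects which agent's queue receives the payload.
        self._queues.setdefault(dst_port, []).append(payload)

    def receive(self, port):
        # Hand over, and clear, whatever is pending for this port.
        return self._queues.pop(port, [])

demux = PortDemux()
demux.deliver(53, b"dns query")
demux.deliver(80, b"http request")
demux.deliver(53, b"another dns query")
```

After these calls, the agent bound to port 53 would receive two payloads and the agent bound to port 80 would receive one, even though all three packets were addressed to the same node.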
The TCP header embodiment 102b of FIG. 1c divides the header into a plurality of fields which identify: 1) a source port 105c; 2) a destination port 106c; 3) a sequence number 107c; 4) an acknowledgment number 104c; and 5) a TCP checksum value 110c (among other parameters such as header length, control, window, etc.). As the TCP protocol is connection oriented, it includes both a sequence number 107c and an acknowledgment number 104c so that two agents in communication with one another across an IP network can ensure that the order of packets associated with the connection between them is preserved.
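The TCP fields listed above can be extracted with a short sketch. It assumes the standard 20-byte fixed TCP header layout (options are ignored); the names `TCP_FIXED_HEADER` and `parse_tcp_header` are hypothetical.

```python
import struct

# Assumed standard fixed TCP header (20 bytes, network byte order):
# source port, destination port, sequence number, acknowledgment number,
# data offset/flags, window, checksum, urgent pointer.
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")

def parse_tcp_header(segment):
    """Return the main fields of the fixed TCP header as a dict."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, _urgent) = TCP_FIXED_HEADER.unpack_from(segment)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # data offset counts 32-bit words
        "flags": offset_flags & 0x01FF,          # control bits
        "window": window,
        "checksum": checksum,
    }

# Example segment: header length of 5 words, ACK control bit (0x010) set.
sample = TCP_FIXED_HEADER.pack(443, 51000, 1000, 2000, (5 << 12) | 0x010, 65535, 0, 0)
fields = parse_tcp_header(sample)
```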
Referring back to FIG. 1a, note that a UDP header 102a or a TCP header 102b is prefixed to the application data 101a. This combination may be referred to as the UDP packet 109a or TCP packet 109b, respectively. A UDP pseudo header 112a or a TCP pseudo header 112b may then be created and prefixed to the UDP packet 109a or TCP packet 109b, respectively. The pseudo header 112a, 112b is used to calculate the checksum value 110b, 110c found within the UDP or TCP headers. An embodiment of a UDP pseudo header 112a or a TCP pseudo header 112b is observed in FIG. 1d. 
A checksum is a number whose value represents the particular sequence of bits found within a block of data. As such, two identical blocks of data have the same checksum; and, two different blocks of data will, with high probability, have different checksum values. In typical embodiments, the UDP/TCP pseudo header 112a, 112b, the UDP/TCP header 102a, 102b and the application data 101a are together viewed as the “block” of data over which the checksum is calculated. This “block” of data is effectively viewed as a succession of 16 bit integers that are summed using one's complement logic.
The end result of this addition (in the standard TCP and UDP definitions, the 16 bit one's complement of the sum) is the checksum value 110b, 110c that is stored in the UDP/TCP header 102a, 102b of FIGS. 1b and 1c. Note that, for the checksum calculation process, a string of 0s is typically used to represent the header checksum value 110b, 110c; and, the pseudo header 112a, 112b may be “padded” as appropriate with zeros (via the zero field 154 of FIG. 1d) so that the “block” of data is evenly divided into fixed length (e.g., 16 bit) sections.
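The summation described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes 16 bit big-endian words, pads an odd-length block with a trailing zero byte, and, following the standard TCP and UDP definitions, stores the one's complement of the sum. The function names are hypothetical.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add the block as 16-bit words using one's complement arithmetic."""
    if len(data) % 2:
        data += b"\x00"  # pad so the block divides evenly into 16-bit sections
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total

def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement of the one's complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF

# Four 16-bit words: 0x0001, 0xF203, 0xF4F5, 0xF6F7
block = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
word_sum = ones_complement_sum(block)   # 0xDDF2
checksum = internet_checksum(block)     # 0x220D
```

For these four words the running sum overflows 16 bits twice; each time the carry is added back into the low-order bits, which is what distinguishes one's complement addition from ordinary modular addition.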
As seen in FIG. 1d, the pseudo header 112a, 112b includes IP source and destination fields 150, 151 as well as a length indicator 152, an IP Protocol Type indicator 153 and the zero padding field 154. Because both the IP source and destination and the application data may differ from packet to packet, different checksum values are expected for packets having a different IP source/destination pair and/or different application data content. Referring back to FIG. 1a, once the checksum value is calculated and inserted into the UDP/TCP header 102a, 102b, an IP packet 120a is formed by discarding the pseudo header 112a, 112b and prefixing an IP header 103a to the UDP/TCP packet 109a, 109b. 
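A hedged sketch of the sender-side calculation for UDP follows. It assumes IPv4 addresses and the standard pseudo header byte order (source address, destination address, a zero byte, the protocol number, the UDP length), which may be ordered differently from the fields as drawn in FIG. 1d. The helper names are hypothetical.

```python
import socket
import struct

UDP_PROTOCOL = 17  # IP protocol number assigned to UDP

def ones_complement_sum(data: bytes) -> int:
    """Add the block as 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)
    return total

def build_pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """12-byte IPv4 pseudo header; never transmitted, only summed."""
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, UDP_PROTOCOL, udp_length)

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    udp_length = 8 + len(payload)
    pseudo = build_pseudo_header(src_ip, dst_ip, udp_length)
    # The checksum field is a string of 0s while the sum is being taken.
    header = struct.pack("!HHHH", src_port, dst_port, udp_length, 0)
    checksum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    # UDP sends all ones in place of a computed zero, since a zero field
    # means "no checksum" in that protocol.
    return checksum or 0xFFFF

first = udp_checksum("192.0.2.1", "192.0.2.2", 5000, 53, b"hello")
second = udp_checksum("192.0.2.3", "192.0.2.2", 5000, 53, b"hello")
```

The two calls differ only in the source IP address, yet yield different checksum values, which illustrates why the pseudo header is included in the block even though it is discarded before transmission.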
The IP packet 120a is then presented to the particular physical media network type (e.g., Ethernet) that interconnects the sending node to the “next” node in the network. Note that in FIG. 1a, an Ethernet header 104a and trailer 111a are shown as an example of how an Ethernet packet may be constructed to carry the IP packet through an Ethernet network.
Once received at its destination, the packet is deconstructed. Moving upward in FIG. 1a along the receiving path direction, after the IP header 103a is removed, another pseudo header 112a, 112b is created (using the IP address of the destination device) and prefixed to the UDP/TCP packet 109a, 109b. The checksum that was received in the UDP/TCP header 102a, 102b may then be removed (so it can be used for comparative purposes as discussed below) and replaced with a string of 0s.
In at least one approach, the checksum is then re-calculated over the pseudo header 112a, 112b, UDP/TCP header 102a, 102b, and application data 101a. The checksum calculated at the destination is then compared with the checksum that was extracted from the UDP/TCP header 102a, 102b. If the checksums match, there is a high probability that the data was not corrupted during transmission and the packet is “accepted”; otherwise, it is discarded.
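The receive-side comparison described above might be sketched as follows. The sketch assumes the stored checksum is the one's complement of the 16 bit sum (as in standard TCP and UDP) and that the caller supplies the byte offset of the checksum field (6 for UDP, 16 for TCP); `verify_by_recalculation` is a hypothetical name.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add the block as 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)
    return total

def internet_checksum(data: bytes) -> int:
    return ~ones_complement_sum(data) & 0xFFFF

def verify_by_recalculation(pseudo_header: bytes, packet: bytes,
                            checksum_offset: int) -> bool:
    # Extract the received checksum, then replace it with a string of 0s.
    received = int.from_bytes(packet[checksum_offset:checksum_offset + 2], "big")
    zeroed = packet[:checksum_offset] + b"\x00\x00" + packet[checksum_offset + 2:]
    # Re-calculate over pseudo header, header and data, then compare.
    return internet_checksum(pseudo_header + zeroed) == received

# Example UDP packet: ports 5000 -> 53, length 12, payload b"ping".
pseudo = bytes([192, 0, 2, 1, 192, 0, 2, 2, 0, 17, 0, 12])
unsummed = bytes([0x13, 0x88, 0x00, 0x35, 0x00, 0x0C, 0x00, 0x00]) + b"ping"
value = internet_checksum(pseudo + unsummed)
good = unsummed[:6] + value.to_bytes(2, "big") + unsummed[8:]
corrupted = good[:-1] + b"!"  # last payload byte altered in transit

accepted = verify_by_recalculation(pseudo, good, 6)
rejected = not verify_by_recalculation(pseudo, corrupted, 6)
```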
In alternate approaches, a property of 1's complement addition (where X+X′=0) is exploited: the checksum is calculated over the data together with the received checksum value, rather than over the data alone. If the final answer is “0”, the checksum is deemed “good.” Other types of checksum approaches that are known or yet to be developed may also be used.
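A minimal sketch of this alternate check is given below, again assuming that the stored checksum is the one's complement of the 16 bit sum. Because X+X′ is all ones in one's complement arithmetic (a representation of zero), complementing the sum taken over the whole block, received checksum included, yields 0 for an undamaged packet. The name `verify_by_summation` is hypothetical.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add the block as 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)
    return total

def verify_by_summation(pseudo_header: bytes, packet: bytes) -> bool:
    # The received checksum is left in place and summed with everything else.
    final_answer = ~ones_complement_sum(pseudo_header + packet) & 0xFFFF
    return final_answer == 0

# Example UDP packet: ports 5000 -> 53, length 12, payload b"ping".
pseudo = bytes([192, 0, 2, 1, 192, 0, 2, 2, 0, 17, 0, 12])
unsummed = bytes([0x13, 0x88, 0x00, 0x35, 0x00, 0x0C, 0x00, 0x00]) + b"ping"
value = ~ones_complement_sum(pseudo + unsummed) & 0xFFFF
good = unsummed[:6] + value.to_bytes(2, "big") + unsummed[8:]
corrupted = good[:-1] + b"!"  # last payload byte altered in transit

is_good = verify_by_summation(pseudo, good)
is_bad = not verify_by_summation(pseudo, corrupted)
```

Compared with re-calculating and comparing, this form avoids extracting and zeroing the checksum field, at the cost of summing one additional 16 bit word.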
b. Software Calculation of the TCP and UDP Checksums
FIG. 2 shows an embodiment of a Central Processing Unit (CPU) 200 found within a computing system. A CPU 200a is responsible for executing software for the machine or device having the CPU 200a. A CPU typically comprises, as seen in FIG. 2, one or more processors 201a which are coupled to a system memory 202a (e.g., through a memory controller 203a as observed in FIG. 2). Note that other CPU architectures may exist that differ from the one depicted in FIG. 2. For example, in distributed computing environments, a plurality of processor units is typically coupled to one or more system memory units.
In order to implement the software methodologies that execute on a CPU 200, the processor(s) 201 typically execute a plurality of instructions and manipulate a plurality of data units. The instructions and data units are found in either (or both) the system memory unit 202a and the cache unit 206a. Generally, frequently used (and/or imminently used) instructions and data units are stored in the cache unit 206a. As a result, the instruction execution logic of the processor(s) 201a has this information “nearby” so as to avoid the delay associated with retrieving it from system memory 202a. System memory 202a is typically implemented with Dynamic Random Access Memory (DRAM) cells (which are more dense but slower) while the cache unit 206a is typically implemented with Static Random Access Memory (SRAM) cells (which are less dense but faster).
In the prior art, the TCP and/or UDP functionality, which includes the TCP and UDP checksum calculations described above, is implemented in software. As such, the checksum calculation process is executed via the execution of instructions and the manipulation of data units that reside in system memory 202 and/or cache 206. The intensive operations associated with the calculation of a checksum tend to hinder the performance of the CPU 200. That is, the repeated additions of fixed length sections of the application data, UDP/TCP header and pseudo header consume the resources of the system memory 202 and the instruction execution logic, and “pollute” the cache 206 within the processors 201, such that the resources the CPU 200 can devote to other functions (e.g., application programs) are noticeably reduced.