1. Field of the Invention
This invention relates to reassembly of data fragments of fragmented datagrams in a communication system. In particular, the invention relates to reducing and/or detecting a likelihood of misassembly of data fragments in a communication system utilizing the Internet Protocol (IP) caused by IP identification wraparound.
2. Description of the Related Art
The Internet Protocol (IP) has become one of the most widely used communication protocols in the world. IP is part of a layered protocol, which means that another higher level protocol typically uses IP for data communication. Examples of such higher level protocols are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). In addition, even higher level protocols are sometimes utilized, such as the Network File System (NFS). These protocols are well known to those skilled in the art. The protocols are used to send data from a sending station (e.g., a client or a server on a sending end of a communication) to a receiving station (e.g., a client or a server on a receiving end of a communication), possibly through one or more routing devices that form an IP path.
In order to send a TCP, UDP or other protocol datagram across an IP connection, the datagram is encapsulated in an IP datagram. Often, the IP datagram must be fragmented into plural IP data fragments in order to be sent using the physical network. For example, if a size of the datagram exceeds the physical link's maximum transmission unit (MTU), that datagram must be fragmented into plural IP data fragments with sizes that do not exceed the MTU. Then, a receiving station reassembles the data fragments into the datagram.
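By way of illustration only, the fragmentation step described above might be sketched as follows. The function name fragment, its parameters, and the 20-byte header length (an IP header without options) are assumptions made for this sketch and are not taken from any particular implementation.

```python
def fragment(payload: bytes, mtu: int, header_len: int = 20):
    """Split payload into (offset, data, more_fragments) tuples that fit the MTU.

    header_len = 20 assumes an IP header without options (an assumption of
    this sketch). Offsets are given in bytes for readability.
    """
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because IP expresses fragment offsets in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + len(chunk) < len(payload)  # False only for the last one
        fragments.append((start, chunk, more))
    return fragments


# A 4000-byte payload sent over a link with a 1500-byte MTU.
pieces = fragment(bytes(4000), mtu=1500)
sizes = [len(data) for _, data, _ in pieces]
```

In this sketch, each fragment carries at most 1480 data bytes, so the 4000-byte payload is split into three fragments, and only the last fragment has its more-fragments indication cleared.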
A receiving station determines that data fragments belong to a single IP datagram by looking at an IP identification number in a header of each data fragment. All data fragments from the same IP datagram share the same IP identification number. In addition, the header of each data fragment includes an offset from the start of the datagram, a length of the data fragment, and a flag that indicates whether or not the datagram includes more data fragments. This information is sufficient for reassembly of the IP datagram, which includes the original TCP, UDP or other protocol datagram.
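By way of illustration only, reassembly from the header information described above might be sketched as follows. The Fragment type and the functions group_by_ident and try_reassemble are hypothetical names introduced for this sketch. The sketch groups fragments on the identification number alone, as described above, keeps offsets in bytes, and does not handle overlapping fragments.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    ident: int    # 16-bit IP identification number
    offset: int   # byte offset from the start of the datagram
    data: bytes   # fragment payload; its length is len(data)
    more: bool    # True if more fragments follow this one


def group_by_ident(fragments):
    """Collect fragments that share an identification number."""
    groups = defaultdict(list)
    for frag in fragments:
        groups[frag.ident].append(frag)
    return groups


def try_reassemble(fragments):
    """Return the datagram payload if the fragments are complete, else None."""
    ordered = sorted(fragments, key=lambda f: f.offset)
    if not ordered or ordered[-1].more:
        return None  # the final fragment (more == False) has not arrived
    expected = 0
    payload = bytearray()
    for frag in ordered:
        if frag.offset != expected:
            return None  # a hole (or an overlap, which this sketch rejects)
        payload += frag.data
        expected += len(frag.data)
    return bytes(payload)


# Fragments of one datagram, received out of order.
received = [
    Fragment(ident=7, offset=8, data=b"ij", more=False),
    Fragment(ident=7, offset=0, data=b"abcd", more=True),
    Fragment(ident=7, offset=4, data=b"efgh", more=True),
]
datagram = try_reassemble(group_by_ident(received)[7])
```

Because the identification number is the only key in this sketch, nothing in it distinguishes fragments of two different datagrams that happen to carry the same identification number.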
According to IP, the IP identification number is 16 bits long with a range of 0 to 65535. A sending station conventionally uses a simple counter to determine the IP identification number for each IP datagram. In the early days of IP communications, a receiving station most likely would receive all data fragments of a datagram with a particular IP identification number and reassemble the datagram well before this counter could wrap around. If a data fragment was lost, thereby making reassembly of a datagram impossible, all received data fragments of that datagram would be discarded after a timeout of 64 seconds. With the slower communication speeds that existed in the early days of IP communications, this timeout was usually sufficient to ensure data fragments would be discarded before the counter at the sending station could wrap around.
However, today's Internet communications are much faster. Gigabit and 100 Mb/s Ethernet implementations are commonplace, and faster implementations are constantly being developed. As the communications speed increases, the number of IP datagrams sent by a sending station per unit of time also increases. Thus, the simple 16-bit counter conventionally used to generate IP identification numbers wraps around much more quickly. In fact, in a high speed setting, the counter can almost be guaranteed to wrap around within 64 seconds. Thus, a receiving station can receive data fragments from two different IP datagrams that share a common IP identification number before a first one of those datagrams is reassembled.
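By way of illustration only, the wraparound time can be estimated with the following sketch. It assumes a sending station that saturates the link with datagrams of a single size, ignores header overhead, and uses a hypothetical datagram size of 8192 bytes; the function name wrap_seconds is likewise an assumption of the sketch.

```python
ID_SPACE = 2 ** 16           # distinct 16-bit identification numbers
REASSEMBLY_TIMEOUT_S = 64    # reassembly timeout stated in the text


def wrap_seconds(link_bits_per_s: float, datagram_bytes: int) -> float:
    """Time for the ID counter to wrap if the link is saturated.

    Assumes one counter increment per datagram and ignores header overhead.
    """
    return ID_SPACE * datagram_bytes * 8 / link_bits_per_s


# Hypothetical 8192-byte datagrams at three link speeds.
for name, rate in [("10 Mb/s", 10e6), ("100 Mb/s", 100e6), ("1 Gb/s", 1e9)]:
    t = wrap_seconds(rate, 8192)
    within = t < REASSEMBLY_TIMEOUT_S
    print(f"{name}: counter wraps in about {t:.1f} s; within timeout: {within}")
```

Under these assumptions, the counter takes roughly 429.5 seconds to wrap at 10 Mb/s, but only about 42.9 seconds at 100 Mb/s and about 4.3 seconds at 1 Gb/s, both of which fall inside the 64 second timeout.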
Because of the nature of IP communications, it is possible for a data fragment from a second one of two datagrams to arrive at the receiving station before a corresponding data fragment from a first one of the two datagrams. Then, if the two datagrams share a common IP identification number due to wraparound of the sending station's IP identification number counter, the receiving station can misassemble the data fragments. This misassembly can result in corruption of the datagram.
For example, if first datagram A is fragmented into data fragments A1, A2, A3, A4 and A5, and second datagram B is fragmented into data fragments B1, B2, B3 and B4, it is possible for a receiving station to receive the data fragment B2 before data fragment A2. Then, if datagram A and datagram B share a common IP identification number due to wraparound of the sending station's IP identification number counter, the receiving station can misassemble data fragments A1, B2, A3, A4 and A5 into a datagram, which of course would not contain the proper data.
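By way of illustration only, the above example can be reproduced with the following sketch. The function reassemble, the shared identification number 42, and the rule that the first data fragment to arrive for a given position is kept are all assumptions made for the sketch; fragments are represented by labels rather than by real data.

```python
def reassemble(arrivals):
    """Naive reassembly keyed only on the IP identification number.

    arrivals: iterable of (ident, index, is_last, label) in arrival order,
    where index is the fragment's position within its datagram.
    Returns a list of completed datagrams, each a list of fragment labels.
    """
    pending = {}    # ident -> {"slots": {index: label}, "last": index or None}
    completed = []
    for ident, index, is_last, label in arrivals:
        entry = pending.setdefault(ident, {"slots": {}, "last": None})
        # Assumption: the first fragment to arrive for a position is kept;
        # later arrivals for the same position are treated as duplicates.
        entry["slots"].setdefault(index, label)
        if is_last:
            entry["last"] = index
        last = entry["last"]
        if last is not None and all(i in entry["slots"] for i in range(last + 1)):
            completed.append([entry["slots"][i] for i in range(last + 1)])
            del pending[ident]
    return completed


SHARED_ID = 42  # hypothetical identification number reused after wraparound
arrivals = [
    (SHARED_ID, 0, False, "A1"),
    (SHARED_ID, 1, False, "B2"),  # B2 arrives before A2
    (SHARED_ID, 2, False, "A3"),
    (SHARED_ID, 3, False, "A4"),
    (SHARED_ID, 4, True, "A5"),
]
result = reassemble(arrivals)
# result[0] is ['A1', 'B2', 'A3', 'A4', 'A5']
```

Because both datagrams carry the same identification number, the sketch has no basis for rejecting B2, and the datagram is declared complete as soon as A5 arrives.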
Higher level protocols such as TCP and UDP utilize checksums and length checks in an attempt to catch such data corruption. However, the UDP checksum is only 16 bits long. It has been found that in a high speed environment, IP misassembly errors might occur with sufficient frequency that eventually a “false positive” checksum can result. In this case, the checksum can indicate that the UDP datagram has been properly reassembled, while in fact the datagram has been corrupted. Other properties of conventional IP exacerbate this situation, such as IP's acceptance of overlapping data fragments during datagram reassembly. In a UDP communication setting, these types of errors can lead to undetected data corruption. This data corruption might only come to light when the data is actually utilized, a situation that preferably should be avoided.
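By way of illustration only, the following sketch shows how a 16-bit checksum of the kind used by UDP can accept a misassembled datagram. The checksum is computed here over the payload alone, whereas the actual UDP checksum also covers a pseudo-header and the UDP header, which are omitted for brevity. The payload contents are hypothetical and were chosen so that the substituted fragment's 16-bit words add up to the same value as those of the intended fragment.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Hypothetical payloads: the middle fragment differs, but its 16-bit words
# sum to the same value, so the checksum cannot tell the two apart.
intended = b"A1A1" + b"\x12\x34\x56\x78" + b"A3A3"
misassembled = b"A1A1" + b"\x56\x78\x12\x34" + b"A3A3"

same_checksum = internet_checksum(intended) == internet_checksum(misassembled)
# same_checksum is True even though intended != misassembled
```

The example is contrived, but it shows that a checksum with only 65536 possible values cannot distinguish every corrupted datagram from the intended one, so a sufficiently large number of misassemblies can be expected eventually to produce one that passes.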