On Apr. 4, 2006, a notification was published referencing a problem with the standard internetworking protocol IPv4, as defined in RFC 791 maintained by the Internet Engineering Task Force (IETF). The reference, written by J. Heffner, M. Mathis, and B. Chandler (herein incorporated by reference in its entirety), describes how datagrams transferred over IPv4 can suffer data corruption due to issues relating to the datagram identification field within an IPv4 header. The corruption results from having a limited number of datagram identifiers available during the lifetime of a datagram. No solution to the problem was offered by the authors.
IPv4 can transfer a datagram of up to 65,535 (2^16 - 1) bytes. An Internetworking Protocol (IP) layer of a protocol stack including IPv4 assigns the datagram a 16-bit identifier, which implies there are 65,536 (2^16) possible datagram identifiers. Typically, an IP datagram identifier is implemented as a counter that is incremented every time a datagram identifier is used. When the counter reaches its maximum value, 65,535 for a 16-bit counter, the value returns to zero. The IP layer will fragment a large datagram into smaller chunks to send the fragments over a medium, Ethernet for example. If Ethernet supports 1500-byte frames, then the IP layer will create up to 45 frames for a maximum-size datagram (each frame carrying at most 1480 bytes of data after a 20-byte IP header), where each frame has the same datagram identifier and an offset into the datagram. The datagram identifier and offset information are used by a remote host to reassemble the datagram before passing the datagram to the upper layers of the communication stack. On high-speed media, a host could send more than 65,536 datagrams in a very short time, causing the host to wrap the value of the datagram identifier counter. For example, if Ethernet running at 1 Gigabit per second (Gbps) is used, the datagram identifier counter could wrap in less than 1 second, assuming relatively small datagrams. Most communication stacks hold a datagram for reassembly from 30 seconds to 120 seconds. Therefore, if a datagram having a specific datagram identifier is stored in memory for reassembly and one of its fragments is lost, then a fragment of a subsequent, different datagram having the same identifier could corrupt the first datagram. The corruption occurs because the receiving host interprets the fragment from the second datagram as belonging to the first datagram, since it has the same datagram identifier and fills the offset of the missing fragment. This problem applies to TCP, UDP, ICMP, or other data transported over IP.
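The failure mode described above can be illustrated with a short sketch. It is a toy model, not an actual IP stack: the names `Fragment`, `fragment`, `Reassembler`, and `demo` are hypothetical, a 20-byte IP header without options is assumed, and the reassembly buffer is keyed only by the datagram identifier (a real stack would also key on source address, destination address, and protocol). Under these assumptions, a maximum-size datagram splits into 45 fragments on a 1500-byte medium, and a fragment from a later datagram that reuses the identifier silently fills the hole left by a lost fragment.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

IP_HEADER = 20   # bytes; assumes an IPv4 header without options
MTU = 1500       # bytes per frame, as in the Ethernet example in the text


@dataclass(frozen=True)
class Fragment:
    ident: int    # 16-bit datagram identifier
    offset: int   # byte offset of this fragment's data within the datagram
    more: bool    # True when further fragments follow this one
    data: bytes


def fragment(ident: int, payload: bytes, mtu: int = MTU) -> List[Fragment]:
    """Split a datagram payload into fragments that each fit in one frame."""
    chunk = (mtu - IP_HEADER) // 8 * 8   # fragment data length is a multiple of 8
    return [
        Fragment(ident & 0xFFFF, off, off + chunk < len(payload),
                 payload[off:off + chunk])
        for off in range(0, len(payload), chunk)
    ]


class Reassembler:
    """Toy reassembly buffer keyed only by datagram identifier."""

    def __init__(self) -> None:
        self.pending: Dict[int, Dict[int, Fragment]] = {}

    def receive(self, frag: Fragment) -> Optional[bytes]:
        """Store a fragment; return the datagram once it has no holes."""
        parts = self.pending.setdefault(frag.ident, {})
        parts.setdefault(frag.offset, frag)   # first fragment seen per offset wins
        if all(f.more for f in parts.values()):
            return None                       # final fragment not seen yet
        expected = 0
        for off in sorted(parts):
            if off != expected:
                return None                   # a hole remains
            expected += len(parts[off].data)
        del self.pending[frag.ident]
        return b"".join(parts[off].data for off in sorted(parts))


def demo() -> Optional[bytes]:
    """Lose one fragment of datagram A, then deliver datagram B with the same id."""
    first = fragment(7, b"A" * 4000)            # offsets 0, 1480, 2960
    second = fragment(7 + 65536, b"B" * 4000)   # counter has wrapped back to 7
    rx = Reassembler()
    rx.receive(first[0])
    rx.receive(first[2])                        # first[1] is lost in transit
    for frag in second:                         # arrives within the hold time
        result = rx.receive(frag)
        if result is not None:
            return result                       # mixed A/B payload is delivered
    return None
```

In this sketch `demo()` returns a 4000-byte payload whose middle 1480 bytes come from the second datagram, which is the corruption the text describes. Nothing in the identifier or offset fields lets the receiver tell the two datagrams apart.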
The problem can be characterized as resulting from using a pool of datagram identifiers where the pool has a limited number of available datagram identifiers while the datagram is alive in the system.
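A rough calculation makes the size of the shortfall concrete. The sketch below compares the 16-bit identifier pool with the number of identifiers a sender consumes while a datagram may still be held for reassembly. The function names are hypothetical, and the model assumes a sender that saturates the link with equal-size datagrams and ignores framing overhead, so the figures are estimates only.

```python
import math

ID_SPACE = 2 ** 16   # number of distinct 16-bit datagram identifiers


def wrap_time_seconds(link_bps: int, datagram_bytes: int) -> float:
    """Time for a saturating sender to use every identifier once."""
    return ID_SPACE * datagram_bytes * 8 / link_bps


def identifiers_needed(link_bps: int, datagram_bytes: int, lifetime_s: int) -> int:
    """Identifiers consumed while one datagram is still held for reassembly."""
    return math.ceil(link_bps * lifetime_s / (datagram_bytes * 8))


if __name__ == "__main__":
    gbps = 1_000_000_000
    # 1 Gbps, 1500-byte datagrams, 30-second reassembly hold time
    print(wrap_time_seconds(gbps, 1500))        # about 0.79 seconds
    print(identifiers_needed(gbps, 1500, 30))   # far more than ID_SPACE
```

Under these assumptions the counter wraps in well under a second at 1 Gbps, while a 30-second hold time would call for millions of distinct identifiers, roughly 38 times what a 16-bit field provides.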
Interestingly, Zetera™ Corporation, a producer of network storage technology, encountered and resolved the datagram corruption problem in the same time frame as the public release of the problem statement. Zetera discovered the problem while running a proprietary storage protocol, the Z-SAN™ protocol, over UDP/IP on a 1 Gbps Ethernet system. Zetera has created a solution to the problem, as described in Zetera U.S. provisional patent application assigned Ser. No. 60/791,051, filed on Apr. 10, 2006, herein incorporated by reference in its entirety.
Further research regarding the datagram corruption issue indicates that the problem has manifested itself as far back as 1987 when customers of Sun's Network File System (NFS) implementation suffered from data corruption. NFS used 8 KB UDP datagrams that would become corrupted for the reason described above. Implementations of NFS addressed the problem by shortening the time NFS waits for a response at the application layer, or through large checksum values on the datagram (32-bit checksums or greater). Such solutions mitigate the risk of loss, but do not create a solution for the problem. In addition, short timeouts reduce efficiency of the system because the system must conduct additional retries.
U.S. Pat. No. 6,894,976 titled “Prevention and detection of IP identification wraparound errors” teaches a method of reducing the risk associated with the problem of IP datagram identifier wrap around through the use of timeouts and checksums. However, this reference does not present a viable solution for the problem that applies to all IP based applications.
The described problem is an inherent part of standardized IPs and cannot be resolved universally without changing the standard. However, it is desirable to have a real solution that can resolve the issue in a manner that applies to network storage, other network devices, or network applications. It is contemplated that a real solution would be adopted by the standards. A desirable solution should have the following characteristics:
The solution should operate at the IP layer so that applications do not have to change.
The solution should have backward compatibility, so that if any changes are required in the standard, new versions of a communications stack can still operate with legacy stacks.
The solution should require minimal changes to existing stacks.
The solution should be future-proof with respect to the ever-increasing rate of data transfer (10 Gbps, 100 Gbps, etc.).
The solution should couple a datagram identifier with the datagram lifetime to aid in controlling the use of the datagram identifiers.
Clearly, there remains a long-felt need for a solution to the datagram corruption problem. Preferably, the solution completely addresses the problem rather than simply mitigating the risk of the problem occurring.