Error detection codes are used in all sorts of digital communication applications to enable the receiver of a message transmitted over a noisy channel to determine whether the message has been corrupted in transit. Before transmitting the message, the transmitter calculates an error detection code based on the message contents, and appends the code to the message. The receiver recalculates the code based on the message that it has received and compares it to the code appended by the transmitter. If the values do not match, the receiver determines that the message has been corrupted and, in most cases, discards the message.
Cyclic redundancy codes (CRCs) are one of the most commonly used types of error detecting codes. To calculate the CRC of a message, a polynomial g(X) is chosen, having N+1 binary coefficients g0 . . . gN. The CRC is given by the remainder of the message, augmented by N zero bits, when divided by g(X). In other words, the CRC of an augmented message D(X) is simply D(X) mod g(X), i.e., the remainder of D(X) divided by g(X). There are many methods known in the art for efficient hardware and software implementation of CRC calculations. A useful survey of these methods is presented by Williams in “A Painless Guide to CRC Error Detection Algorithms” (Rocksoft Pty Ltd., Hazelwood Park, Australia, 1993), which is incorporated herein by reference.
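By way of illustration only, the definition above (the remainder of the augmented message divided by g(X), with all arithmetic performed modulo 2) may be sketched in Python as follows. The function name crc_remainder and the representation of the message and of g(X) as lists of bits, most significant coefficient first, are assumptions made for this example and do not appear in the description above.

```python
def crc_remainder(message_bits, poly_bits):
    """Return the CRC as a list of N bits.

    message_bits: message, most significant bit first (hypothetical layout).
    poly_bits:    the N+1 coefficients of g(X), highest-order coefficient first.
    """
    n = len(poly_bits) - 1
    # Augment the message with N zero bits to form D(X).
    work = list(message_bits) + [0] * n
    # Long division modulo 2: wherever the leading bit is 1, subtract (XOR) g(X).
    for i in range(len(message_bits)):
        if work[i]:
            for j, g in enumerate(poly_bits):
                work[i + j] ^= g
    # The last N bits are D(X) mod g(X).
    return work[-n:]


if __name__ == "__main__":
    msg = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
    print(crc_remainder(msg, [1, 0, 1, 1]))  # g(X) = X^3 + X + 1; prints [1, 0, 0]
```

The sketch favors clarity over speed; it processes one bit at a time and is not representative of the efficient implementations surveyed by Williams.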
FIG. 1 is a block diagram that schematically illustrates a rudimentary hardware-based CRC calculator 20, as is known in the art. To calculate the CRC of an input message, the message bits are passed through a sequence of one-bit registers 22. There are N registers, corresponding to the N+1 coefficients g0 . . . gN of the polynomial g(X). A plurality of one-bit multipliers 24 (i.e., AND gates) are loaded with the values of coefficients g0 . . . gN (wherein g0=gN=1). At each cycle, the bit output of calculator 20 is fed back through multipliers 24, and the bit output by each multiplier is added to the bit value in the preceding shift register 22 by a one-bit adder 26. As there is no carry from one adder 26 to the next, these adders function simply as XOR gates. The last N bits output by calculator 20 after the end of the augmented input bitstream are the CRC of the message.
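A software analogue of such a bit-serial shift register may be sketched as follows. This is a simplified model under stated assumptions, not a gate-level description of calculator 20: the N one-bit registers are represented by a single integer reg, the feedback through the multipliers and adders is modeled as a conditional XOR with the low N coefficients of g(X), and the CRC is read from the register contents after the last augmented bit has been shifted in. The function name crc_shift_register and its parameters are hypothetical.

```python
def crc_shift_register(message_bits, poly, n):
    """Bit-serial CRC using an N-bit shift register held in an integer.

    poly: g(X) as an integer with N+1 bits (g_N is the top bit).
    n:    the degree N of g(X).
    """
    mask = (1 << n) - 1
    reg = 0
    # Feed the message followed by N zero bits (the augmented message).
    for bit in list(message_bits) + [0] * n:
        feedback = (reg >> (n - 1)) & 1      # bit shifted out of the register
        reg = ((reg << 1) | bit) & mask      # shift left, take in the next bit
        if feedback:
            reg ^= poly & mask               # XOR with coefficients g_0 .. g_(N-1)
    return reg


if __name__ == "__main__":
    msg = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
    print(bin(crc_shift_register(msg, 0b1011, 3)))  # prints 0b100
```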
FIG. 2 is a block diagram that schematically illustrates a more efficient, table-based CRC calculator 30, as is also known in the art. In this case, the message is input to the calculator in words that are M bits wide, which are held successively by M-bit registers 32. A table 34, typically stored in read-only memory (ROM), receives the upper M bits output by calculator 30 at each cycle, u(X), and outputs the value (u(X)*X^M) mod g(X). Here X^M corresponds to a shift left of M bits, and the mod g(X) operation represents the remainder of the foregoing expression divided by g(X). Adders 36 in this case are implemented by M-bit parallel XOR gates. The last word u(X) output by calculator 30 after the end of the augmented message is the CRC of the message.
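The table-based approach may be illustrated by the following minimal sketch, which assumes, for simplicity, that M = N = 8, so that a single 8-bit register suffices, and that g(X) = X^8 + X^2 + X + 1. These choices, together with the names POLY, TABLE, build_table and crc_table, are assumptions of the example and are not taken from FIG. 2. Each table entry holds (u(X)*X^M) mod g(X), and the table output is XORed with the next incoming word.

```python
M = 8          # word width in bits (assumed equal to N in this sketch)
POLY = 0x107   # g(X) = X^8 + X^2 + X + 1, including the leading term


def build_table():
    """Precompute (u(X) * X^M) mod g(X) for every M-bit value u."""
    table = []
    for u in range(1 << M):
        r = u
        for _ in range(M):
            r <<= 1                 # multiply by X
            if r & (1 << M):
                r ^= POLY           # reduce modulo g(X)
        table.append(r)
    return table


TABLE = build_table()


def crc_table(data: bytes) -> int:
    """CRC of data, processed one M-bit word per step."""
    reg = 0
    # Augment the message with N = 8 zero bits, i.e. one zero word.
    for word in data + b"\x00":
        reg = TABLE[reg] ^ word     # table lookup, then M-bit parallel XOR
    return reg


if __name__ == "__main__":
    print(hex(crc_table(b"123456789")))  # prints 0xf4
```

Each step replaces M single-bit shift operations with one table lookup and one XOR, at the cost of a table of 2^M entries.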
It is common in many networks, such as Internet Protocol (IP) networks, for the transmitter to break up messages into multiple segments for transmission, due to packet size limitations, for example. The messages are generated by a higher-level protocol, which calculates and appends the CRC to each message before transmission. The receiver can check the CRC only after it has received all of the segments of the message. If the segments arrive at the receiver in order, the CRC can be calculated at the receiver in simple pipeline fashion over the successive parts of the message as they arrive, and then compared to the CRC that was appended to the message at the transmitter. In IP networks, however, there is no guarantee that all of the segments will arrive in order at the receiver. Consequently, in implementations known in the art, the receiver must have sufficient buffer capacity to hold all of the segments until the entire multi-segment message has been received. Only then is it possible to arrange the segments in their proper order so as to calculate the CRC and determine whether to accept or reject the message.
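The in-order case may be illustrated as follows. In this sketch the running CRC is carried from one segment to the next, so that no segment need be retained after it has been processed. The CRC-32 function zlib.crc32 of the Python standard library is used here only as a convenient stand-in for whatever CRC the higher-level protocol specifies, and the function name crc_in_order is hypothetical.

```python
import zlib


def crc_in_order(segments):
    """Pipeline the CRC over segments that arrive in their original order."""
    crc = 0
    for segment in segments:
        # zlib.crc32 accepts the running value as its second argument,
        # so each segment can be discarded once it has been processed.
        crc = zlib.crc32(segment, crc)
    return crc


if __name__ == "__main__":
    segments = [b"1234", b"56789"]
    print(hex(crc_in_order(segments)))    # prints 0xcbf43926
    print(hex(zlib.crc32(b"123456789")))  # prints 0xcbf43926
```

The running value is meaningful only when the segments are supplied in message order, which is why out-of-order arrival forces the receiver to buffer.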
In some applications, the buffer required to hold all of the message segments for CRC checking can be very large. An example of such an application is the Internet Small Computer System Interface (iSCSI) protocol, which maps SCSI information for transport over IP networks using the Transmission Control Protocol (TCP). Prior to the transfer of the SCSI data, the iSCSI protocol breaks the iSCSI data into individual blocks called Protocol Data Units (PDUs), each of which is protected by its own CRC. These PDUs are subsequently broken down into units of data called TCP segments, which are commonly smaller than iSCSI PDUs. The TCP segments are then transferred over the network by TCP/IP, independently of one another.
On the receiver side, the TCP segments are collected and assembled into iSCSI PDUs and are then passed on for further iSCSI processing. In particular, the receiver must check the iSCSI CRC of every PDU that it receives, in order to confirm that the data are intact before passing the PDU on for further processing. The iSCSI protocol is intended to handle very high bandwidths (multiple gigabits per second) and to tolerate large delays (up to hundreds of milliseconds in wide-area networks). Since TCP/IP does not guarantee in-order delivery (i.e., the TCP segments may not be received in the order in which they were sent), the receiver must buffer the TCP segments until all the segments making up an iSCSI PDU have been collected before it can verify the CRC of that PDU. To calculate CRCs of entire PDUs under these conditions, using methods known in the art, the iSCSI receiver requires a large, costly, high-speed buffer memory.
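The buffering requirement described above may be sketched as follows. The class PduReassembler, its fields, and the assumptions that the PDU length is known in advance and that segments do not overlap are hypothetical simplifications made for this example. As a further assumption, zlib.crc32 stands in for the CRC used by iSCSI and is not asserted to be the same code.

```python
import zlib


class PduReassembler:
    """Hold out-of-order segments of one PDU until all have arrived."""

    def __init__(self, pdu_length):
        self.pdu_length = pdu_length
        self.pending = {}     # byte offset within the PDU -> segment payload
        self.received = 0     # total payload bytes buffered so far

    def add_segment(self, offset, payload):
        """Buffer a segment; return the PDU's CRC once complete, else None."""
        if offset not in self.pending:        # ignore retransmitted duplicates
            self.pending[offset] = payload
            self.received += len(payload)
        if self.received < self.pdu_length:
            return None                       # must keep buffering
        # Only now can the segments be ordered and the CRC computed.
        data = b"".join(self.pending[o] for o in sorted(self.pending))
        return zlib.crc32(data)


if __name__ == "__main__":
    r = PduReassembler(pdu_length=9)
    print(r.add_segment(4, b"56789"))         # prints None
    print(hex(r.add_segment(0, b"1234")))     # prints 0xcbf43926
```

In this model the buffer must be able to hold an entire PDU for every PDU that is in flight, which illustrates why the memory requirement grows with both bandwidth and delay.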