A purpose of error detection techniques, such as techniques based on cyclic redundancy codes (CRCs), is to enable the receiver of a message transmitted through a noisy channel to determine whether the message has been corrupted. To do this, the transmitter generates a value (called a Frame Check Sequence or FCS) that is a function of the message, and typically appends the FCS to the message. The receiver can then apply the same function that was used to generate the FCS to the received message to determine whether the message was correctly received.
With CRC algorithms, message bits are treated as binary coefficients of an n-bit polynomial. The message polynomial is multiplied by x^m, where m is the CRC polynomial (i.e., “generator polynomial”) order. The result of the multiplication is divided by the CRC polynomial. Most implementations use a method that simultaneously executes the multiplication by x^m and the division by the CRC polynomial, rather than doing these operations in sequential order. The remainder of this division is the FCS, which is typically complemented and appended to the message. In some cases, the FCS is not complemented, and occasionally the FCS is put in another location, such as in a header field.
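The computation described above can be sketched bit by bit as follows. This is an illustrative sketch only, under stated assumptions: the function names crc_bitwise and append_fcs are hypothetical, bits are processed most significant bit first, the register starts at zero, and the default generator 0x04C11DB7 (a widely used 32-bit CRC polynomial, written without its leading x^32 term) is merely an example choice.

```python
def crc_bitwise(message: bytes, poly: int = 0x04C11DB7, width: int = 32) -> int:
    """Return the remainder of message(x) * x**width divided by the generator.

    Assumptions (not from the source text): MSB-first bit order, zero initial
    register, and `poly` given without its implicit x**width term.
    """
    top_bit = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = 0
    for byte in message:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            # Feeding each message bit in at the top of the register performs
            # the multiplication by x**width and the division in one step.
            feedback = ((reg & top_bit) != 0) ^ bool(bit)
            reg = (reg << 1) & mask
            if feedback:
                reg ^= poly
    return reg


def append_fcs(message: bytes, complement: bool = True,
               poly: int = 0x04C11DB7, width: int = 32) -> bytes:
    """Append the (optionally complemented) FCS to the message, high byte first."""
    fcs = crc_bitwise(message, poly, width)
    if complement:
        fcs ^= (1 << width) - 1
    return message + fcs.to_bytes(width // 8, "big")
```

As a small hand-checkable case, with the 8-bit generator x^8 + x^2 + x + 1 (poly=0x07, width=8) the one-byte message 0x01 corresponds to the polynomial 1, so the remainder of x^8 divided by the generator is x^2 + x + 1, i.e. 0x07.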
The receiver divides the received message with the appended FCS by the CRC polynomial. Assuming that the FCS was complemented before being appended to the message, and that no errors occurred during transmission, the result of the division at the receiver will be a fixed value equal to the remainder obtained by dividing a polynomial of 2m terms (with coefficients of 1 for the upper m terms, and coefficients of 0 for the lower m terms) by the CRC polynomial. This fixed value is sometimes called the “magic number.” If the result of the division is not equal to the magic number, this indicates that an error occurred.
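The receiver-side check can be illustrated with the sketch below. The names remainder, magic_number, frame, and is_intact are hypothetical, and the sketch assumes a 32-bit generator 0x04C11DB7, MSB-first bit order, a zero initial register, and a receiver that applies the same combined multiply-and-divide step as the transmitter.

```python
POLY = 0x04C11DB7   # assumed example generator, leading x**32 term implicit
WIDTH = 32
MASK = (1 << WIDTH) - 1
TOP_BIT = 1 << (WIDTH - 1)


def remainder(data: bytes) -> int:
    """Remainder of data(x) * x**WIDTH divided by the generator (MSB first)."""
    reg = 0
    for byte in data:
        for i in range(7, -1, -1):
            feedback = ((reg & TOP_BIT) != 0) ^ bool((byte >> i) & 1)
            reg = (reg << 1) & MASK
            if feedback:
                reg ^= POLY
    return reg


def magic_number() -> int:
    """Remainder of the polynomial with m one-coefficients above m zeros.

    remainder() already multiplies its input by x**WIDTH, so feeding it
    WIDTH one-bits yields exactly that polynomial before the division.
    """
    return remainder(b"\xff" * (WIDTH // 8))


def frame(message: bytes) -> bytes:
    """Transmitter side: append the complemented FCS to the message."""
    fcs = remainder(message) ^ MASK
    return message + fcs.to_bytes(WIDTH // 8, "big")


def is_intact(received: bytes) -> bool:
    """Receiver side: divide message plus FCS and compare to the magic number."""
    return remainder(received) == magic_number()
```

The check works because the message followed by the uncomplemented FCS is an exact multiple of the generator, so only the complementing term (m ones, shifted up by m positions) contributes to the receiver's remainder, independent of the message content.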
Most software-based CRC algorithms process one byte of data at a time. One reason for processing one byte at a time appears to be the belief that large tables are needed for handling more bytes at a time. Another is that a byte-at-a-time circuit is typically needed anyway in order to handle misaligned bytes, so the same circuit may be used to process all bytes. If the traditional method is extended to process sixteen bits at a time, a lookup table of 2^16 entries might be used, consuming 256 kilobytes for a 32-bit CRC. If the traditional method is extended to process thirty-two bits at a time, the lookup table of 2^32 entries would consume sixteen gigabytes for a 32-bit CRC.
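A minimal sketch of the traditional byte-at-a-time table method, together with the table-size arithmetic quoted above, might look as follows. The names make_table, crc_bytewise, and table_bytes are hypothetical, and the sketch assumes MSB-first processing, a zero initial register, and the example generator 0x04C11DB7.

```python
def make_table(poly: int = 0x04C11DB7, width: int = 32) -> list:
    """Precompute the remainder contributed by each possible top byte."""
    top_bit = 1 << (width - 1)
    mask = (1 << width) - 1
    table = []
    for byte in range(256):
        reg = byte << (width - 8)
        for _ in range(8):
            if reg & top_bit:
                reg = ((reg << 1) ^ poly) & mask
            else:
                reg = (reg << 1) & mask
        table.append(reg)
    return table


def crc_bytewise(message: bytes, table: list, width: int = 32) -> int:
    """Process one byte per step using a 256-entry lookup table."""
    mask = (1 << width) - 1
    reg = 0
    for byte in message:
        index = ((reg >> (width - 8)) ^ byte) & 0xFF
        reg = ((reg << 8) & mask) ^ table[index]
    return reg


def table_bytes(bits_per_step: int, crc_width: int = 32) -> int:
    """Size in bytes of a direct lookup table indexed by bits_per_step bits."""
    return (1 << bits_per_step) * (crc_width // 8)


# Table sizes for a 32-bit CRC when indexing directly by 8, 16, or 32 bits:
# table_bytes(8)  -> 1024 bytes (1 kilobyte)
# table_bytes(16) -> 262144 bytes (256 kilobytes)
# table_bytes(32) -> 17179869184 bytes (16 gigabytes)
```

Each additional index bit doubles the table, which is why directly widening the lookup from eight bits to sixteen or thirty-two bits quickly becomes impractical.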
A problem with software CRC computations using traditional methods is that multiple instructions are executed for every byte of data, which makes the software CRC computation very burdensome. Many processors operate more efficiently when they can perform word-wide computations.