A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to raw data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents; on retrieval the calculation is repeated, and corrective action can be taken against presumed data corruption if the check values do not match. CRCs are so called because the check (data verification) value is a redundancy (it expands the message without adding information) and the algorithm is based on cyclic codes. CRCs are popular because they are simple to implement in binary hardware, easy to analyze mathematically, and particularly good at detecting common errors caused by noise in transmission channels.
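The attach-and-recheck scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes an 8-bit generator polynomial x^8 + x^2 + x + 1 (0x07) with a zero initial value and no final XOR, and the helper names `crc8`, `attach`, and `is_intact` are hypothetical, chosen only for this example.

```python
POLY = 0x07  # assumed generator x^8 + x^2 + x + 1 (the x^8 term is implicit)


def crc8(data: bytes) -> int:
    """Return the remainder of data(x) * x^8 divided by the generator over GF(2)."""
    crc = 0
    for byte in data:
        crc ^= byte  # bring the next 8 message bits into the remainder
        for _ in range(8):
            if crc & 0x80:  # leading term present: subtract (XOR) the generator
                crc = ((crc << 1) ^ POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def attach(data: bytes) -> bytes:
    """Append the check value to a block of data (hypothetical helper)."""
    return data + bytes([crc8(data)])


def is_intact(block: bytes) -> bool:
    """Repeat the calculation on retrieval; a zero remainder means the values match."""
    return crc8(block) == 0


if __name__ == "__main__":
    stored = attach(b"123456789")
    print(is_intact(stored))                         # block as written
    corrupted = bytes([stored[0] ^ 0x01]) + stored[1:]
    print(is_intact(corrupted))                      # block with one flipped bit
```

With a zero initial value and no final XOR, recomputing the CRC over the data plus its check value yields zero, which is why `is_intact` compares against 0 rather than recomputing and comparing two values.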
The simplest CRC designs to implement in hardware are linear feedback shift register (LFSR) circuits. Serial LFSR circuits are small and compact, but they can process only one bit per cycle. To meet the higher throughput requirements of modern communications applications, parallel CRC calculation has become popular. Typically, these CRC architectures process one or more bytes of data in parallel. A common way to achieve the required parallelism is to unroll the serial implementation.
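The relationship between the serial LFSR and its unrolled, byte-wide counterpart can be modeled in software. The sketch below is illustrative only and assumes the same 8-bit polynomial 0x07 with a zero initial state; `lfsr_step`, `crc_serial`, `crc_parallel`, and `TABLE` are hypothetical names. The lookup table stands in for the XOR network that unrolling eight serial steps would produce in hardware.

```python
POLY = 0x07   # assumed generator x^8 + x^2 + x + 1
WIDTH = 8
MASK = (1 << WIDTH) - 1


def lfsr_step(state: int, bit: int) -> int:
    """One clock of the serial LFSR: consume a single message bit."""
    feedback = ((state >> (WIDTH - 1)) & 1) ^ bit
    state = (state << 1) & MASK
    if feedback:
        state ^= POLY
    return state


def crc_serial(data: bytes) -> int:
    """Serial model: eight clocks per byte, most significant bit first."""
    state = 0
    for byte in data:
        for i in range(7, -1, -1):
            state = lfsr_step(state, (byte >> i) & 1)
    return state


def _unroll_eight(state: int) -> int:
    """Collapse eight zero-input serial steps into one state transformation."""
    for _ in range(8):
        state = lfsr_step(state, 0)
    return state


# Precomputed result of the unrolled logic for every 8-bit input.
TABLE = [_unroll_eight(x) for x in range(256)]


def crc_parallel(data: bytes) -> int:
    """Parallel model: one 'clock' per byte using the unrolled transformation."""
    state = 0
    for byte in data:
        state = TABLE[state ^ byte]
    return state


if __name__ == "__main__":
    msg = b"parallel CRC"
    print(crc_serial(msg) == crc_parallel(msg))
```

Both functions compute the same remainder; the parallel version simply takes one step per byte instead of eight, at the cost of a wider next-state function.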
Most CRC implementations contain a feedback loop in which a partial CRC result is updated every clock cycle to incorporate the next piece of the message that arrives. This one-cycle loop is often the critical cycle that limits performance. For serial LFSR circuits, the feedback loop does not pose a major timing challenge because the computation performed in the loop is simple and straightforward. This is not the case for parallel CRC implementations, however. Typical algorithms for implementing parallel CRC significantly lengthen the worst-case timing path, and this path is usually in the feedback loop of the CRC circuit. Because the feedback loops in these CRC algorithms are one-cycle loops, one cannot shorten the worst-case timing path merely by introducing more pipeline stages into the feedback path. As a result, most parallel CRC designs are unable to attain their theoretical speed-up over serial LFSRs, even after consuming significantly more area and power.
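To give a rough feel for why unrolling lengthens the feedback path, the following sketch symbolically unrolls a serial LFSR and counts how many inputs (current state bits plus data bits) are XORed together to form each next-state bit. It is only an approximation under stated assumptions: the CRC-32 generator 0x04C11DB7 is used purely as an example, fan-in is taken as a crude proxy for XOR-tree depth, and the names `next_state_terms` and `max_fan_in` are hypothetical. Real timing depends on the synthesis tool and the target technology.

```python
POLY = 0x04C11DB7  # CRC-32 generator, used here only as an example
WIDTH = 32


def next_state_terms(bits_per_cycle: int) -> list[int]:
    """Symbolically unroll the serial LFSR by `bits_per_cycle` steps.

    Each expression is an integer bitmask over GF(2): bit i (i < WIDTH) stands
    for current state bit i, and bit WIDTH + j stands for the j-th data bit
    consumed in this cycle. XOR of masks is addition of the symbolic terms.
    """
    state = [1 << i for i in range(WIDTH)]
    for j in range(bits_per_cycle):
        feedback = state[WIDTH - 1] ^ (1 << (WIDTH + j))
        shifted = [0] + state[:-1]
        state = [
            shifted[i] ^ (feedback if (POLY >> i) & 1 else 0)
            for i in range(WIDTH)
        ]
    return state


def max_fan_in(bits_per_cycle: int) -> int:
    """Largest number of inputs XORed into any single next-state bit."""
    return max(bin(expr).count("1") for expr in next_state_terms(bits_per_cycle))


if __name__ == "__main__":
    for width in (1, 8, 32, 64):
        print(width, max_fan_in(width))
```

In the serial case each next-state bit combines at most three inputs, while the unrolled cases combine many more, all of which must settle within the same single clock cycle.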
What is needed is a method for designing a parallel CRC circuit in which the critical cycle in the CRC feedback path can be extended by an arbitrary number of cycles. The extended cycle would allow the CRC circuit to run at a higher clock frequency and significantly boost the performance of the parallel CRC circuit.